L-EnsNMF: Boosted local topic discovery via ensemble of nonnegative matrix factorization

Sangho Suh, Jaegul Choo, Joonseok Lee, Chandan K. Reddy

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

11 Citations (Scopus)

Abstract

Nonnegative matrix factorization (NMF) has been widely applied in many domains. In document analysis, it has been increasingly used in topic modeling applications, where a set of underlying topics are revealed by a low-rank factor matrix from NMF. However, it is often the case that the resulting topics give only general topic information in the data, which tends not to convey much information. To tackle this problem, we propose a novel ensemble model of nonnegative matrix factorization for discovering high-quality local topics. Our method leverages the idea of an ensemble model, which has been successful in supervised learning, into an unsupervised topic modeling context. That is, our model successively performs NMF given a residual matrix obtained from previous stages and generates a sequence of topic sets. Our algorithm for updating the input matrix has novelty in two aspects. The first lies in utilizing the residual matrix inspired by a state-of-the-art gradient boosting model, and the second stems from applying a sophisticated local weighting scheme on the given matrix to enhance the locality of topics, which in turn delivers high-quality, focused topics of interest to users. We evaluate our proposed method by comparing it against other topic modeling methods, such as a few variants of NMF and latent Dirichlet allocation, in terms of various evaluation measures representing topic coherence, diversity, coverage, computing time, and so on. We also present qualitative evaluation on the topics discovered by our method using several real-world data sets.
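
The staged procedure in the abstract (run NMF, form a residual matrix, run NMF again on that residual) can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not give the residual update rule or the local weighting scheme, so the sketch assumes a plain subtract-and-clip residual and omits local weighting entirely. The function names `nmf` and `residual_ensemble_nmf`, the toy matrix, and all parameter values are hypothetical.

```python
import numpy as np


def nmf(X, k, n_iter=200, seed=0, eps=1e-10):
    """Rank-k NMF of a nonnegative matrix X via multiplicative updates."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, k)) + eps  # term-by-topic factor
    H = rng.random((k, n)) + eps  # topic-by-document factor
    for _ in range(n_iter):
        # Standard multiplicative updates; eps guards against division by zero.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


def residual_ensemble_nmf(X, k, n_stages, n_iter=200, seed=0):
    """Run NMF stage by stage, each time on what earlier stages left unexplained."""
    R = np.asarray(X, dtype=float).copy()
    stages = []
    for s in range(n_stages):
        W, H = nmf(R, k, n_iter=n_iter, seed=seed + s)
        stages.append((W, H))
        # ASSUMPTION: subtract this stage's approximation and clip at zero so
        # the next stage still receives a nonnegative matrix. The paper's
        # local weighting step is not reproduced here.
        R = np.maximum(R - W @ H, 0.0)
    return stages, R


# Toy term-by-document matrix (random data, for illustration only).
X = np.random.default_rng(42).random((30, 20))
stages, R = residual_ensemble_nmf(X, k=3, n_stages=4)
```

Each entry of `stages` holds one topic set as a pair `(W, H)`, so four stages with `k=3` yield twelve topics in total. Because every stage subtracts a nonnegative approximation, the residual can only shrink from one stage to the next under this assumed update.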

Original language: English
Title of host publication: Proceedings - 16th IEEE International Conference on Data Mining, ICDM 2016
Editors: Francesco Bonchi, Xindong Wu, Ricardo Baeza-Yates, Josep Domingo-Ferrer, Zhi-Hua Zhou
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 479-488
Number of pages: 10
ISBN (Electronic): 9781509054725
Publication status: Published - 2017 Jan 31
Event: 16th IEEE International Conference on Data Mining, ICDM 2016 - Barcelona, Catalonia, Spain
Duration: 2016 Dec 12 to 2016 Dec 15

Publication series

Name: Proceedings - IEEE International Conference on Data Mining, ICDM
ISSN (Print): 1550-4786

Conference

Conference: 16th IEEE International Conference on Data Mining, ICDM 2016
Country/Territory: Spain
City: Barcelona, Catalonia
Period: 16/12/12 to 16/12/15

Keywords

  • Ensemble learning
  • Gradient boosting
  • Local weighting
  • Matrix factorization
  • Topic modeling

ASJC Scopus subject areas

  • Engineering(all)
