Topic document model approach for naive Bayes text classification

Sang Bum Kim, Hae Chang Rim, Jin Dong Kim

Research output: Contribution to journal, Article

1 Citation (Scopus)

Abstract

The multinomial naive Bayes model has been widely used for probabilistic text classification. However, parameter estimation for this model sometimes yields inappropriate probabilities. In this paper, we propose a topic document model for multinomial naive Bayes text classification, in which the parameters are estimated from the normalized term frequencies of each training document. Experiments on the Reuters-21578 and 20 Newsgroups collections show that the proposed approach significantly improves performance over the traditional multinomial naive Bayes.
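The idea of estimating parameters from per-document normalized term frequencies can be illustrated with a minimal sketch. This is not the authors' exact formulation, which the abstract does not give. It assumes that each training document's term counts are divided by the document length, that the normalized frequencies are summed per class, and that additive smoothing is applied. The function names `train` and `classify`, the smoothing constant `alpha`, and the toy data are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def train(docs, labels, alpha=1.0):
    """Estimate class priors and term log-probabilities.

    docs: list of token lists; labels: one class label per document.
    alpha: additive smoothing constant (an illustrative assumption).
    """
    vocab = sorted({t for d in docs for t in d})
    by_class = defaultdict(list)
    for d, y in zip(docs, labels):
        by_class[y].append(d)

    priors, cond = {}, {}
    for y, class_docs in by_class.items():
        priors[y] = math.log(len(class_docs) / len(docs))
        # Sum per-document normalized term frequencies, so every
        # non-empty document contributes a total mass of 1 no matter
        # how long it is.
        mass = Counter()
        for d in class_docs:
            if not d:
                continue
            for term, count in Counter(d).items():
                mass[term] += count / len(d)
        total = sum(mass.values()) + alpha * len(vocab)
        cond[y] = {t: math.log((mass[t] + alpha) / total) for t in vocab}
    return priors, cond


def classify(tokens, priors, cond):
    """Return the class with the highest log-posterior score.

    Terms that never occurred in training are ignored.
    """
    scores = {
        y: priors[y] + sum(cond[y][t] for t in tokens if t in cond[y])
        for y in priors
    }
    return max(scores, key=scores.get)


# Toy data, invented purely for illustration.
docs = [
    ["goal", "goal", "match"],
    ["match", "team"],
    ["vote", "law", "law"],
    ["law", "court"],
]
labels = ["sport", "sport", "politics", "politics"]
priors, cond = train(docs, labels)
print(classify(["goal", "team"], priors, cond))
```

In the traditional multinomial estimate, raw counts are pooled over all documents of a class, so a single long document can dominate the term probabilities. Normalizing within each document first gives every training document equal weight, which is one plausible reading of the approach described in the abstract.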

Original language: English
Pages (from-to): 1091-1094
Number of pages: 4
Journal: IEICE Transactions on Information and Systems
Volume: E88-D
Issue number: 5
Publication status: Published - 2005

Keywords

  • Naive Bayes
  • Text classification

ASJC Scopus subject areas

  • Software
  • Hardware and Architecture
  • Computer Vision and Pattern Recognition
  • Electrical and Electronic Engineering
  • Artificial Intelligence
