Semi-Supervised Multinomial Naive Bayes for Text Classification by Leveraging Word-Level Statistical Constraint

Zhao, Li (Tsinghua University) | Huang, Minlie (Tsinghua University) | Yao, Ziyu (Beijing University of Posts and Telecommunications) | Su, Rongwei (Samsung Research and Development Institute China - Beijing) | Jiang, Yingying (Samsung Research and Development Institute China - Beijing) | Zhu, Xiaoyan (Tsinghua University)

AAAI Conferences 

Multinomial Naive Bayes with Expectation Maximization (MNB-EM) is a standard semi-supervised learning method for augmenting Multinomial Naive Bayes (MNB) in text classification. Despite its success, MNB-EM is unstable: it may either improve on MNB or fail to do so. We believe this is because MNB-EM lacks the ability to preserve the class distribution on words. In this paper, we propose a novel method that augments MNB-EM by leveraging word-level statistical constraints to preserve the class distribution on words. The word-level statistical constraints are further converted to constraints on the document posteriors generated by MNB-EM. Experiments demonstrate that our method consistently improves MNB-EM and substantially outperforms state-of-the-art baselines.
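For readers unfamiliar with the baseline, the following is a minimal sketch of plain MNB-EM only, not the constrained method proposed in the paper. It assumes bag-of-words count matrices, Laplace smoothing on word probabilities, and a fixed number of EM iterations; the function names (`fit_mnb`, `posteriors`, `mnb_em`) and parameters (`alpha`, `n_iter`) are hypothetical and chosen for illustration.

```python
import numpy as np

def fit_mnb(X, resp, alpha=1.0):
    """M-step: estimate log priors and log word probabilities from soft labels.

    X: (n_docs, n_words) word-count matrix.
    resp: (n_docs, n_classes) class responsibilities (one-hot for labeled docs).
    alpha: Laplace smoothing constant for word counts (an assumption).
    """
    class_mass = resp.sum(axis=0)
    log_prior = np.log(class_mass / class_mass.sum())
    word_counts = resp.T @ X + alpha  # (n_classes, n_words), smoothed
    log_word = np.log(word_counts / word_counts.sum(axis=1, keepdims=True))
    return log_prior, log_word

def posteriors(X, log_prior, log_word):
    """E-step: document posteriors P(class | doc) under the current model."""
    log_joint = X @ log_word.T + log_prior
    log_joint = log_joint - log_joint.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(log_joint)
    return p / p.sum(axis=1, keepdims=True)

def mnb_em(X_lab, y_lab, X_unlab, n_classes, n_iter=10, alpha=1.0):
    """Train MNB on labeled data, then refine it with EM over unlabeled data."""
    resp_lab = np.eye(n_classes)[y_lab]  # labeled docs keep fixed one-hot labels
    log_prior, log_word = fit_mnb(X_lab, resp_lab, alpha)
    X_all = np.vstack([X_lab, X_unlab])
    for _ in range(n_iter):
        resp_unlab = posteriors(X_unlab, log_prior, log_word)  # E-step
        resp_all = np.vstack([resp_lab, resp_unlab])
        log_prior, log_word = fit_mnb(X_all, resp_all, alpha)  # M-step
    return log_prior, log_word
```

In this sketch the unlabeled documents influence the word distributions only through their soft posteriors, which is where the abstract locates the instability: nothing ties the re-estimated per-word class distribution back to the one observed in the labeled data.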
