Maximal Margin Labeling for Multi-Topic Text Categorization

Kazawa, Hideto; Izumitani, Tomonori; Taira, Hirotoshi; Maeda, Eisaku

Neural Information Processing Systems 

In this paper, we address the problem of statistical learning for multi-topic text categorization (MTC), whose goal is to choose all relevant topics (a label) from a given set of topics. The proposed algorithm, Maximal Margin Labeling (MML), treats all possible labels as independent classes and learns a multi-class classifier on the induced multi-class categorization problem. To cope with the data sparseness caused by the huge number of possible labels, MML combines some prior knowledge about label prototypes and a maximal margin criterion in a novel way. Experiments with multi-topic Web pages show that MML outperforms existing learning algorithms including Support Vector Machines.
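
As an illustration of why the number of possible labels becomes huge, the following minimal sketch maps each set of topics to its own class index. It covers only the reduction to a multi-class problem, not MML's label prototypes or its maximal margin criterion. The topic names, the function names, and the exclusion of the empty topic set are assumptions made for this example and do not come from the paper.

```python
from itertools import combinations

def all_labels(topics):
    """Enumerate every non-empty subset of topics; each subset is one label.

    Assumption: the empty topic set is not counted as a label.
    """
    topics = sorted(topics)
    return [frozenset(c)
            for r in range(1, len(topics) + 1)
            for c in combinations(topics, r)]

def to_class_ids(topic_sets, topics):
    """Map each document's topic set to the index of its label class."""
    index = {label: i for i, label in enumerate(all_labels(topics))}
    return [index[frozenset(s)] for s in topic_sets]

# Hypothetical topics and documents, for illustration only.
topics = ["sports", "business", "health"]
docs = [{"sports"}, {"sports", "business"}, {"sports", "business", "health"}]

class_ids = to_class_ids(docs, topics)   # one class index per document
num_classes = len(all_labels(topics))    # 2**3 - 1 = 7 classes for 3 topics
```

With n topics this enumeration yields 2^n - 1 classes, so most classes have few or no training documents even for moderate n, which is the data sparseness the abstract refers to.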
