Dual-Clustering Maximum Entropy with Application to Classification and Word Embedding

Wang, Xiaolong (University of Illinois) | Wang, Jingjing (University of Illinois) | Zhai, Chengxiang (University of Illinois)

Feb-14-2017–AAAI Conferences

Maximum Entropy (ME), a general-purpose machine learning model, has been successfully applied in fields such as text mining and natural language processing. It has long been used as a classification technique and has recently also been applied to learning word embeddings. ME defines a distribution of exponential form over items (classes or words). When training such a model, learning efficiency is guaranteed by globally updating the entire set of model parameters associated with all items at each training instance. This creates a significant computational challenge when the number of items is large. To achieve learning efficiency at an affordable computational cost, we propose an approach named Dual-Clustering Maximum Entropy (DCME). Exploiting the primal-dual form of ME, it performs clustering in the dual space and approximates each dual distribution by the corresponding cluster center. This naturally enables a hybrid online-offline optimization algorithm whose time complexity per instance scales only as the product of the feature/word vector dimensionality and the number of clusters. Experimental studies on text classification and word embedding learning demonstrate that DCME effectively strikes a balance between training speed and model quality, substantially outperforming state-of-the-art methods.
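
To make the cost described in the abstract concrete, the following is a minimal sketch of a plain ME (softmax) update for one training instance. It is not the DCME algorithm and is not taken from the paper; the names `me_distribution` and `full_me_update`, the learning rate, and the toy data are assumptions made for illustration. It only shows why a global update touches every item, giving a per-instance cost proportional to the number of items times the vector dimensionality.

```python
import math

def me_distribution(scores):
    """Exponential-form (softmax) distribution over items from raw scores."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def full_me_update(W, x, y, lr=0.1):
    """One gradient-ascent step on log p(y | x) for a single instance.

    W is a list of K weight vectors (one per item), each of length d.
    All K vectors are updated, so the cost per instance is O(K * d).
    Returns the distribution computed before the update.
    """
    scores = [sum(w_j * x_j for w_j, x_j in zip(w, x)) for w in W]
    p = me_distribution(scores)
    for k, w in enumerate(W):
        # Gradient of log p(y | x) w.r.t. w_k is (1[k == y] - p_k) * x.
        coeff = (1.0 if k == y else 0.0) - p[k]
        for j in range(len(w)):
            w[j] += lr * coeff * x[j]
    return p

# Toy example (hypothetical data): K = 3 items, d = 2 features.
W = [[0.0, 0.0] for _ in range(3)]
x = [1.0, 2.0]
before = full_me_update(W, x, y=1)
after = me_distribution([sum(a * b for a, b in zip(w, x)) for w in W])
```

In this sketch the inner loop runs over all K items for every instance. According to the abstract, DCME replaces that dependence on the number of items with a dependence on the number of clusters; how it does so is described in the paper itself and is not reproduced here.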

artificial intelligence, machine learning, natural language

Country:
- North America > United States > Illinois (0.28)

Genre:
- Research Report
  - New Finding (0.48)
  - Experimental Study (0.34)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language (1.00)
  - Machine Learning > Statistical Learning
    - Maximum Entropy (0.82)
