Hierarchical learning of grids of microtopics

Jojic, Nebojsa, Perina, Alessandro, Kim, Dongwoo

Jun-8-2016–arXiv.org Machine Learning

The counting grid is a grid of microtopics, which are sparse word/feature distributions. The generative model associated with the grid does not use these microtopics individually, but in predefined groups which can only be (ad)mixed as such. Each allowed group corresponds to one of all possible overlapping rectangular windows into the grid. The capacity of the model is controlled by the ratio of the grid size to the window size. This paper builds upon the basic counting grid (CG) model and shows that hierarchical reasoning helps avoid bad local minima, produces better classification accuracy and, most interestingly, allows for extraction of large numbers of coherent microtopics even from small datasets. We evaluate this in terms of consistency, diversity and clarity of the indexed content, as well as in a user study on word intrusion tasks. We demonstrate that these models work well as a technique for embedding raw images and discuss interesting parallels between hierarchical CG models and other deep architectures.
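
The window mechanism described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that a window's word distribution is the uniform average of the microtopics it covers, that windows wrap around the grid edges, and that each document uses a single window, as in the basic (non-hierarchical) model. The names `window_distributions`, `capacity`, and `sample_document` are hypothetical, as are the grid size, window size, and vocabulary size.

```python
import numpy as np


def window_distributions(pi, window):
    """Average the microtopics inside every window of the grid.

    pi:     array of shape (E1, E2, Z); pi[i, j] is a distribution over Z words.
    window: (W1, W2) window size.
    Returns h of shape (E1, E2, Z), where h[i, j] is the mean of pi over the
    window whose top-left corner is (i, j). Wrap-around at the grid edges is
    an assumption of this sketch.
    """
    w1, w2 = window
    h = np.zeros_like(pi)
    for di in range(w1):
        for dj in range(w2):
            # A negative shift moves pi[i + di, j + dj] to position (i, j).
            h += np.roll(pi, shift=(-di, -dj), axis=(0, 1))
    return h / (w1 * w2)


def capacity(grid_shape, window):
    """Ratio of grid area to window area (the capacity control in the abstract)."""
    return (grid_shape[0] * grid_shape[1]) / (window[0] * window[1])


def sample_document(h, n_words, rng):
    """Pick one window uniformly at random (assumption) and draw word counts from it."""
    e1, e2, _ = h.shape
    i, j = rng.integers(e1), rng.integers(e2)
    return (i, j), rng.multinomial(n_words, h[i, j])


rng = np.random.default_rng(0)
E1, E2, Z = 8, 8, 20                                 # hypothetical sizes
pi = rng.dirichlet(np.full(Z, 0.2), size=(E1, E2))   # sparse-ish microtopics
h = window_distributions(pi, (3, 3))
location, counts = sample_document(h, n_words=50, rng=rng)
```

Because neighbouring windows overlap, their distributions share most of their microtopics, so nearby grid locations generate similar documents. A larger grid relative to the window gives a higher `capacity` value and more distinct windows.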

algorithm, grid, topic model

Country:
- South America > Paraguay
  - Asunción > Asunción (0.04)
- Oceania > Australia
  - Australian Capital Territory > Canberra (0.04)
- North America > United States
  - Washington > King County
    - Redmond (0.04)
  - Pennsylvania > Allegheny County
    - Pittsburgh (0.04)
  - New York > New York County
    - New York City (0.04)
  - California > San Francisco County
    - San Francisco (0.14)
- Asia
  - Middle East > Jordan (0.04)
  - Thailand (0.04)
  - Indonesia > Sumatra (0.04)
  - China > Tibet Autonomous Region (0.04)

Genre:
- Research Report (1.00)

Industry:
- Energy > Renewable (0.46)
- Health & Medicine > Therapeutic Area
  - Infections and Infectious Diseases (0.46)
  - Immunology (0.46)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (1.00)
