Sparse Parallel Training of Hierarchical Dirichlet Process Topic Models

Alexander Terenin, Måns Magnusson, Leif Jonsson

arXiv.org Machine Learning 

Nonparametric extensions of topic models such as Latent Dirichlet Allocation, including the Hierarchical Dirichlet Process (HDP), are often studied in natural language processing. Training these models generally requires serial algorithms, which limits scalability to large data sets and complicates acceleration on parallel and distributed systems. Most current approaches to scalable training of such models either do not converge to the correct target distribution, or are not data-parallel. Moreover, these approaches generally do not exploit all available sources of sparsity found in natural language, which is an important way to make computation efficient. Based on a representation of certain conditional distributions within an HDP, we propose a doubly sparse data-parallel sampler for the HDP topic model that addresses these issues.
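To illustrate what "exploiting sparsity" means in this setting, the sketch below shows the standard trick used in collapsed LDA-style Gibbs samplers: the unnormalized conditional for a token's topic, (n_dk + alpha) * phi_kw, is split into a sparse part driven by the (few) topics active in the current document and a dense smoothing part. This is a minimal, generic illustration only; the function and variable names are hypothetical, and it is not the doubly sparse data-parallel HDP sampler proposed in the paper.

```python
import numpy as np

def sample_topic_sparse(word_id, doc_topic_counts, topic_word_counts,
                        topic_totals, alpha, beta, vocab_size, rng):
    """Draw a topic for one token by splitting the unnormalized conditional
    (n_dk + alpha) * phi_kw into a sparse document part and a dense
    smoothing part. Illustrative sketch only, not the paper's HDP sampler."""
    K = topic_totals.shape[0]
    # Smoothed word probability under each topic (collapsed estimate of phi_kw).
    phi = (topic_word_counts[:, word_id] + beta) / (topic_totals + beta * vocab_size)
    # Dense smoothing mass alpha * phi_kw; in practice its sum is cached per word
    # rather than recomputed for every token.
    smooth_mass = alpha * phi
    # Sparse mass: only topics with nonzero count in this document contribute.
    active = np.nonzero(doc_topic_counts)[0]
    doc_mass = doc_topic_counts[active] * phi[active]
    total = smooth_mass.sum() + doc_mass.sum()
    u = rng.random() * total
    # Walk the (usually short) sparse bucket first, then the dense one.
    acc = 0.0
    for k, m in zip(active, doc_mass):
        acc += m
        if u < acc:
            return k
    for k in range(K):
        acc += smooth_mass[k]
        if u < acc:
            return k
    return K - 1  # floating-point fallback

# Example usage with toy counts:
rng = np.random.default_rng(0)
K, V = 4, 10
topic_word_counts = rng.integers(0, 5, size=(K, V))
topic_totals = topic_word_counts.sum(axis=1)
doc_topic_counts = np.array([3, 0, 1, 0])
z = sample_topic_sparse(2, doc_topic_counts, topic_word_counts,
                        topic_totals, alpha=0.1, beta=0.01,
                        vocab_size=V, rng=rng)
```

Because documents typically activate only a small fraction of the topics, the sparse bucket dominates the probability mass and the per-token work is far below O(K) once the dense smoothing sums are cached; "doubly sparse" in the abstract refers to exploiting more than one such source of sparsity.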
