Distributed Inference for Latent Dirichlet Allocation

Apr-6-2023, 14:37:34 GMT–Neural Information Processing Systems

We investigate the problem of learning a widely-used latent-variable model – the Latent Dirichlet Allocation (LDA) or "topic" model – using distributed computation, where each of the processors sees only part of the total data set. We propose two distributed inference schemes that are motivated from different perspectives. The first scheme uses local Gibbs sampling on each processor with periodic updates; it is simple to implement and can be viewed as an approximation to a single-processor implementation of Gibbs sampling. The second scheme relies on a hierarchical Bayesian extension of the standard LDA model to directly account for the fact that data are distributed across processors; it has a theoretical guarantee of convergence but is more complex to implement than the approximate method. Using five real-world text corpora we show that distributed learning works very well for LDA models, i.e., perplexity and precision-recall scores for distributed learning are indistinguishable from those obtained with single-processor learning.
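The first scheme (local Gibbs sampling with periodic updates) can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not spell out the update rule, so the merge step below (adding each processor's count changes back into a shared word-topic table) is an assumption, as are the function names, the hyperparameters `alpha` and `beta`, and the serial loop that stands in for real parallel processors.

```python
import random

def init_shard(docs, K, rng):
    """Assign a random topic to every token; return assignments and doc-topic counts."""
    z = [[rng.randrange(K) for _ in doc] for doc in docs]
    n_dk = [[0] * K for _ in docs]
    for d, doc in enumerate(docs):
        for i in range(len(doc)):
            n_dk[d][z[d][i]] += 1
    return z, n_dk

def local_gibbs_sweep(docs, z, n_dk, n_wk, n_k, K, V, alpha, beta, rng):
    """One collapsed Gibbs sweep over a shard, updating the local tables in place."""
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            k = z[d][i]
            # Remove the token's current assignment from all counts.
            n_dk[d][k] -= 1
            n_wk[w][k] -= 1
            n_k[k] -= 1
            # Unnormalised conditional probability of each topic for this token.
            weights = [
                (n_dk[d][t] + alpha) * (n_wk[w][t] + beta) / (n_k[t] + V * beta)
                for t in range(K)
            ]
            k = rng.choices(range(K), weights=weights)[0]
            z[d][i] = k
            n_dk[d][k] += 1
            n_wk[w][k] += 1
            n_k[k] += 1

def distributed_lda(shards, K, V, alpha=0.1, beta=0.01, iters=20, seed=0):
    """Simulate the processors serially; each shard is a list of docs (lists of word ids)."""
    rng = random.Random(seed)
    state = [init_shard(docs, K, rng) for docs in shards]
    # Global word-topic counts start as the sum over all shards.
    global_wk = [[0] * K for _ in range(V)]
    for docs, (z, _) in zip(shards, state):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                global_wk[w][z[d][i]] += 1
    for _ in range(iters):
        local_tables = []
        for docs, (z, n_dk) in zip(shards, state):
            # Each processor samples against its own copy of the global counts.
            local_wk = [row[:] for row in global_wk]
            local_k = [sum(local_wk[w][k] for w in range(V)) for k in range(K)]
            local_gibbs_sweep(docs, z, n_dk, local_wk, local_k, K, V, alpha, beta, rng)
            local_tables.append(local_wk)
        # Periodic update (assumed rule): add every processor's count changes.
        global_wk = [
            [global_wk[w][k] + sum(lt[w][k] - global_wk[w][k] for lt in local_tables)
             for k in range(K)]
            for w in range(V)
        ]
    return global_wk, [n_dk for _, n_dk in state]

# Toy usage: two shards, vocabulary of 5 word ids, 2 topics.
shards = [[[0, 1, 2, 1], [2, 2, 3]], [[3, 4, 4], [0, 4, 1, 1]]]
global_wk, doc_topic = distributed_lda(shards, K=2, V=5, iters=5, seed=1)
```

Because each processor only ever reassigns its own tokens, the merged table always accounts for every token in the corpus exactly once. The approximation lies in the fact that, between updates, each processor samples against counts from the other processors that are slightly stale.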

inference, latent dirichlet allocation, processor

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning > Learning Graphical Models (0.88)
  - Natural Language
    - Discourse & Dialogue (0.99)
    - Text Processing (0.65)