Latent Dirichlet Allocation
Blei, David M., Ng, Andrew Y., Jordan, Michael I.
Neural Information Processing Systems
We propose a generative model for text and other collections of discrete data that generalizes or improves on several previous models, including naive Bayes/unigram, mixture of unigrams [6], and Hofmann's aspect model, also known as probabilistic latent semantic indexing (pLSI) [3]. In the context of text modeling, our model posits that each document is generated as a mixture of topics, where the continuous-valued mixture proportions are distributed as a latent Dirichlet random variable. Inference and learning are carried out efficiently via variational algorithms. We present empirical results on applications of this model to problems in text modeling, collaborative filtering, and text classification.

1 Introduction

Recent years have seen the development and successful application of several latent factor models for discrete data. One notable example, Hofmann's pLSI/aspect model [3], has received the attention of many researchers, and applications have emerged in text modeling [3], collaborative filtering [7], and link analysis [1].
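The generative process the abstract describes — per-document topic proportions drawn from a Dirichlet, then a topic and a word drawn for each word position — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the topic count, vocabulary size, Dirichlet parameter, and the randomly initialized topic-word distributions are all assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions, not taken from the paper).
n_topics = 3      # number of latent topics K
vocab_size = 8    # vocabulary size V
doc_length = 20   # words per document N

# Dirichlet prior over per-document topic proportions.
alpha = np.full(n_topics, 0.5)

# Topic-word distributions: one categorical over the vocabulary per topic,
# here drawn at random purely for illustration.
beta = rng.dirichlet(np.ones(vocab_size), size=n_topics)

def generate_document(n_words):
    """Sample one document from the LDA generative process."""
    # Latent, continuous-valued topic proportions for this document.
    theta = rng.dirichlet(alpha)
    # For each word position, draw a topic z_n from theta...
    topics = rng.choice(n_topics, size=n_words, p=theta)
    # ...then draw the word w_n from that topic's distribution over the vocabulary.
    words = np.array([rng.choice(vocab_size, p=beta[z]) for z in topics])
    return theta, topics, words

theta, topics, words = generate_document(doc_length)
```

Fitting the model reverses this process: given only the words, the variational algorithms mentioned in the abstract infer the latent proportions `theta` and the topic-word distributions `beta`.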
Dec-31-2002