Hierarchical mixtures of Unigram models for short text clustering: the role of Beta-Liouville priors

Nov-14-2024–arXiv.org Machine Learning

This paper presents a variant of the Multinomial mixture model tailored to the unsupervised classification of short text data. Traditionally, the Multinomial probability vector in this hierarchical model is assigned a Dirichlet prior. Here we explore an alternative prior, the Beta-Liouville distribution, which offers a more flexible correlation structure than the Dirichlet. We examine the theoretical properties of the Beta-Liouville distribution, focusing on its conjugacy with the Multinomial likelihood. This property enables the derivation of update equations for a Coordinate Ascent Variational Inference (CAVI) algorithm, which approximates the posterior distribution of the model parameters. We also propose a stochastic variant of the CAVI algorithm that improves scalability. The paper concludes with data examples that demonstrate effective strategies for setting the Beta-Liouville hyperparameters.
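To make the conjugacy property concrete, the sketch below draws from a Beta-Liouville distribution and applies the standard conjugate update for Multinomial counts. It is a minimal illustration, not the paper's implementation: the function names (`sample_beta_liouville`, `posterior_params`) and the parameter names (`alpha_vec`, `alpha`, `beta`) are hypothetical, and the parameterization assumed here is the common one in which the first D components share a Beta(alpha, beta) total mass split by a Dirichlet(alpha_vec) draw. The paper's own notation may differ.

```python
import random


def sample_beta_liouville(alpha_vec, alpha, beta, rng=random):
    """Draw one probability vector of length len(alpha_vec) + 1.

    Assumed parameterization: u ~ Beta(alpha, beta) is the total mass of
    the first D components, which is split by y ~ Dirichlet(alpha_vec).
    The last component receives the remaining mass 1 - u.
    """
    u = rng.betavariate(alpha, beta)
    # Dirichlet draw obtained by normalizing independent Gamma variates.
    gammas = [rng.gammavariate(a, 1.0) for a in alpha_vec]
    total = sum(gammas)
    head = [u * g / total for g in gammas]
    return head + [1.0 - u]


def posterior_params(alpha_vec, alpha, beta, counts):
    """Conjugate update given Multinomial counts of length D + 1.

    The first D counts are added to alpha_vec, their sum is added to
    alpha, and the last count is added to beta.
    """
    if len(counts) != len(alpha_vec) + 1:
        raise ValueError("counts must have one more entry than alpha_vec")
    head_counts = counts[:-1]
    new_alpha_vec = [a + n for a, n in zip(alpha_vec, head_counts)]
    new_alpha = alpha + sum(head_counts)
    new_beta = beta + counts[-1]
    return new_alpha_vec, new_alpha, new_beta


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the draw is reproducible
    prior = ([1.0, 2.0, 3.0], 2.0, 1.5)
    print(sample_beta_liouville(*prior, rng=rng))
    print(posterior_params(*prior, counts=[4, 0, 1, 2]))
```

Under this parameterization, setting `alpha` equal to the sum of `alpha_vec` recovers a Dirichlet distribution with parameters `alpha_vec` followed by `beta`. The two extra parameters are what allow a more general correlation structure.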
