Mixture of von Mises-Fisher distribution with sparse prototypes
Rossi, Fabrice, Barbaro, Florian
arXiv.org Artificial Intelligence
Mixtures of von Mises-Fisher distributions can be used to cluster data on the unit hypersphere. They are particularly well suited to high-dimensional directional data such as texts. In this article, we propose to estimate a von Mises-Fisher mixture using an ℓ1-penalized likelihood. This leads to sparse prototypes that improve clustering interpretability. We introduce an expectation-maximisation (EM) algorithm for this estimation and explore the trade-off between the sparsity term and the likelihood term with a path-following algorithm. The model's behaviour is studied on simulated data, and we show the advantages of the approach on real-data benchmarks. We also introduce a new data set of financial reports and demonstrate the benefits of our method for exploratory analysis.
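For readers unfamiliar with the model, the penalized objective mentioned in the abstract can be sketched as follows. The von Mises-Fisher density is standard; the exact form of the penalty (an ℓ1 norm on each prototype, weighted by a single parameter written here as \beta) and the symbols n, K, \pi_k, \mu_k, \kappa_k are assumptions made for illustration and may differ from the notation and parametrization used in the article.

```latex
% von Mises-Fisher density on the unit hypersphere S^{d-1},
% with mean direction \mu (\|\mu\|_2 = 1) and concentration \kappa \ge 0
f(x \mid \mu, \kappa) = C_d(\kappa)\, \exp\!\left(\kappa\, \mu^{\top} x\right),
\qquad
C_d(\kappa) = \frac{\kappa^{d/2-1}}{(2\pi)^{d/2}\, I_{d/2-1}(\kappa)}

% Assumed form of the \ell_1-penalized log-likelihood of a K-component mixture
% (\beta \ge 0 is a hypothetical name for the sparsity weight)
\max_{\pi,\, \mu,\, \kappa}\;
\sum_{i=1}^{n} \log \sum_{k=1}^{K} \pi_k\, f(x_i \mid \mu_k, \kappa_k)
\;-\; \beta \sum_{k=1}^{K} \|\mu_k\|_1
\quad \text{subject to } \|\mu_k\|_2 = 1,\ \ \sum_{k=1}^{K} \pi_k = 1
```

Here I_{\nu} denotes the modified Bessel function of the first kind. Under this reading, \beta = 0 recovers the usual mixture likelihood, and larger values of \beta push more coordinates of each prototype \mu_k to zero, which is the trade-off the path-following algorithm explores.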
Dec-30-2022
- Country:
- Asia > China (0.04)
- North America
- Canada (0.04)
- United States
- Wisconsin > Dane County
- Madison (0.04)
- Texas > Harris County
- Houston (0.04)
- New York > New York County
- New York City (0.14)
- Indiana > Hamilton County
- Fishers (0.04)
- Europe
- United Kingdom > England
- Greater London > London (0.04)
- Spain > Andalusia
- Cádiz Province > Cadiz (0.04)
- Iceland > Capital Region
- Reykjavik (0.04)
- France > Île-de-France
- Genre:
- Research Report > New Finding (0.46)
- Industry:
- Banking & Finance
- Trading (0.92)
- Financial Services (0.66)
- Technology: