Mixture of von Mises-Fisher distribution with sparse prototypes
Rossi, Fabrice, Barbaro, Florian
arXiv.org Artificial Intelligence
Mixtures of von Mises-Fisher distributions can be used to cluster data on the unit hypersphere. They are particularly well suited to high-dimensional directional data such as texts. In this article, we propose to estimate a von Mises-Fisher mixture using an ℓ1-penalized likelihood. This leads to sparse prototypes that improve clustering interpretability. We introduce an expectation-maximisation (EM) algorithm for this estimation and explore the trade-off between the sparsity term and the likelihood term with a path-following algorithm. The model's behaviour is studied on simulated data, and we show the advantages of the approach on real data benchmarks. We also introduce a new data set on financial reports and exhibit the benefits of our method for exploratory analysis.
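The abstract does not spell out how the penalty produces sparse prototypes, so the following is only an illustrative sketch under stated assumptions, not the authors' update rule. It assumes that, for one mixture component, the direction update amounts to maximising `r · mu - lam * ||mu||_1` over the unit ball, where `r` is a weighted sum of the observations and `lam` is a penalty strength already rescaled by the concentration. Under that assumption the maximiser is the soft-thresholded `r`, renormalised to unit length. The names `soft_threshold`, `sparse_direction`, `r` and `lam` are hypothetical.

```python
import math


def soft_threshold(v, t):
    """Shrink each coordinate of v toward zero by t, zeroing small ones."""
    return [math.copysign(max(abs(x) - t, 0.0), x) for x in v]


def sparse_direction(r, lam):
    """Assumed sparse prototype update for one component (hypothetical helper).

    r   : weighted sum of the observations for this component
    lam : penalty strength, assumed already divided by the concentration
    """
    s = soft_threshold(r, lam)
    norm = math.sqrt(sum(x * x for x in s))
    # If the penalty removes every coordinate, return the zero vector
    # rather than dividing by zero.
    return [x / norm for x in s] if norm > 0 else s


# Toy example: the two small coordinates are thresholded away.
r = [3.0, -0.2, 0.1, -4.0]
mu = sparse_direction(r, lam=0.5)
print(mu)
```

Increasing `lam` in this sketch zeroes out more coordinates, which is the kind of trade-off between sparsity and likelihood that a path-following scheme would explore.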
Dec-30-2022
- Country:
- Europe (0.92)
- North America > United States
- New York (0.28)
- Genre:
- Research Report > New Finding (0.46)
- Industry:
- Banking & Finance
- Financial Services (0.66)
- Trading (0.92)