topic polytope




Reviews: Geometric Dirichlet Means Algorithm for topic inference

Neural Information Processing Systems

I like this paper for two different reasons. First, after RecoverKL and the spectral algorithm, this paper brings a genuinely novel and useful perspective to the topic inference problem for LDA, apparently without making strong assumptions about the topics, such as separability via anchor words. Second, it appears to be extremely effective in practice, matching the speed of RecoverKL with the accuracy of Gibbs sampling algorithms. A. The algorithm: Aspects of this work were known before. For example, Blei pointed out the convex geometry in the original LDA paper, and the connection between LDA/NMF and K-means was also known. However, the novel aspect of this paper is that it uses these connections to propose an inference algorithm for LDA based entirely on the geometry of the topic and word simplices. This is done by making an additional connection between the topic inference problem and the Centroidal Voronoi Tessellation of a convex simplex.
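The geometric picture the review describes can be illustrated with a toy sketch (this is an illustration of a centroidal Voronoi tessellation via Lloyd's algorithm, not the paper's actual inference procedure; the data, vocabulary size, and topic count below are hypothetical):

```python
import numpy as np

# Illustrative sketch: Lloyd's algorithm computes a centroidal Voronoi
# tessellation of points lying in a probability simplex. Documents are
# represented as word-frequency vectors (points in the word simplex), and
# the CVT generators play the role of rough topic estimates.

rng = np.random.default_rng(0)

# Hypothetical toy data: 300 documents over a 5-word vocabulary, generated
# as convex combinations of 3 "true" topics.
true_topics = rng.dirichlet(np.ones(5) * 0.5, size=3)  # 3 x 5, rows on the simplex
weights = rng.dirichlet(np.ones(3), size=300)          # 300 x 3 mixture weights
docs = weights @ true_topics                           # 300 x 5, still on the simplex

def lloyd_cvt(points, k, iters=50, seed=0):
    """Plain Lloyd iterations: alternate nearest-generator assignment and
    centroid updates until each generator is the centroid of its Voronoi cell."""
    rng = np.random.default_rng(seed)
    gen = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        # assign each point to its nearest generator (its Voronoi cell)
        d = ((points[:, None, :] - gen[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(axis=1)
        # move each generator to the centroid of its cell
        for j in range(k):
            cell = points[labels == j]
            if len(cell):
                gen[j] = cell.mean(axis=0)
    return gen, labels

centroids, labels = lloyd_cvt(docs, k=3)
# Each centroid is a convex combination of documents, so it stays in the simplex.
print(np.allclose(centroids.sum(axis=1), 1.0))
```

Because every centroid is an average of simplex points, the estimates remain valid word distributions, which is the geometric property the review highlights.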


Streaming dynamic and distributed inference of latent geometric structures

Yurochkin, Mikhail, Fan, Zhiwei, Guha, Aritra, Koutris, Paraschos, Nguyen, XuanLong

arXiv.org Machine Learning

The topic or population polytope (Nguyen, 2015; Tang et al., 2014) is a fundamental geometric object that underlies the presence of latent topic variables in topic and admixture models (Blei et al., 2003; Pritchard et al., 2000). When data and the associated topics are indexed by a time dimension, it is of interest to study the temporal dynamics of such latent geometric structures. In this paper, we study models and algorithms for learning the temporal dynamics of the topic polytope that arises in the analysis of text corpora. The convex geometry of topic models provides the theoretical basis for posterior contraction analysis of latent topics (Nguyen, 2015; Tang et al., 2014). Furthermore, Yurochkin & Nguyen (2016) and Yurochkin et al. (2017) exploited convex geometry to develop fast and quite accurate inference algorithms in a number of parametric and nonparametric settings.


Geometric Dirichlet Means algorithm for topic inference

Yurochkin, Mikhail, Nguyen, XuanLong

arXiv.org Machine Learning

We propose a geometric algorithm for topic learning and inference that is built on the convex geometry of topics arising from the Latent Dirichlet Allocation (LDA) model and its nonparametric extensions. To this end, we study the optimization of a geometric loss function, which is a surrogate to the LDA likelihood. Our method involves a fast, optimization-based weighted clustering procedure augmented with geometric corrections, which overcomes the computational and statistical inefficiencies encountered by other techniques based on Gibbs sampling and variational inference, while achieving accuracy comparable to that of a Gibbs sampler. The topic estimates produced by our method are shown to be statistically consistent under some conditions. The algorithm is evaluated with extensive experiments on simulated and real data.
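The clustering-plus-correction idea in the abstract can be sketched as follows (a minimal illustration, not a faithful reimplementation of Geometric Dirichlet Means; the `scale` parameter below is a hypothetical stand-in for the paper's geometric correction):

```python
import numpy as np

# Sketch of the geometric idea: cluster documents in the word simplex, then
# push each cluster mean outward from the global center. Raw means of noisy
# documents sit strictly inside the topic polytope, so extending them toward
# the simplex boundary compensates for that shrinkage.

rng = np.random.default_rng(1)
docs = rng.dirichlet(np.ones(4), size=200)  # toy documents on a 4-word simplex

def kmeans(points, k, iters=30, seed=0):
    """Basic k-means (Lloyd's algorithm) as the weighted-clustering stand-in."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        labels = ((points[:, None] - centers[None]) ** 2).sum(-1).argmin(1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = points[labels == j].mean(0)
    return centers

def geometric_correction(centers, scale=2.0):
    """Extend each cluster mean along the ray from the global center, then
    clip and renormalize so each estimate stays in the simplex.
    `scale` is a hypothetical tuning knob, not the paper's exact rule."""
    center = centers.mean(axis=0)
    extended = center + scale * (centers - center)
    extended = np.clip(extended, 0.0, None)  # restore nonnegativity
    return extended / extended.sum(axis=1, keepdims=True)

topics = geometric_correction(kmeans(docs, k=3))
print(topics.sum(axis=1))  # each row is a probability distribution over words
```

The correction step is what distinguishes this from plain k-means: cluster means are pulled away from the simplex center so the recovered topics land nearer the extreme points of the topic polytope.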