Reviews: Geometric Dirichlet Means Algorithm for topic inference

Neural Information Processing Systems

I like this paper for two reasons. First, coming after RecoverKL and the spectral algorithm, it brings a novel and useful perspective to the topic-inference problem for LDA, apparently without making strong assumptions about the topics, such as separability via anchor words. Second, it appears to work extremely well in practice, matching the speed of RecoverKL while achieving the accuracy of Gibbs sampling. A. The algorithm: Aspects of this work were known before. For example, Blei pointed out the convex geometry in the original LDA paper, and the connection between LDA/NMF and k-means was also known. The novel contribution here is that these connections are used to propose an inference algorithm for LDA based entirely on the geometry of the topic and word simplexes. This is done by making an additional connection between the topic-inference problem and that of Centroidal Voronoi Tessellations of a convex simplex.


Geometric Dirichlet Means algorithm for topic inference

Yurochkin, Mikhail, Nguyen, XuanLong

Neural Information Processing Systems

We propose a geometric algorithm for topic learning and inference that is built on the convex geometry of topics arising from the Latent Dirichlet Allocation (LDA) model and its nonparametric extensions. To this end we study the optimization of a geometric loss function, which is a surrogate for the LDA likelihood. Our method involves a fast optimization-based weighted clustering procedure augmented with geometric corrections, which overcomes the computational and statistical inefficiencies encountered by other techniques based on Gibbs sampling and variational inference, while achieving accuracy comparable to that of a Gibbs sampler. The topic estimates produced by our method are shown to be statistically consistent under some conditions. The algorithm is evaluated with extensive experiments on simulated and real data.
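As a rough illustration of the recipe the abstract describes — cluster documents on the vocabulary simplex, then apply a geometric correction to the cluster centroids — the following sketch may help. This is not the authors' implementation: the function name, the use of plain k-means in place of the paper's weighted clustering, and the extension scalar `m` are all assumptions made here for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans


def gdm_sketch(W, K, m=2.0, seed=0):
    """Hypothetical sketch of the geometric topic-inference idea.

    W    : (n_docs, vocab_size) matrix of word counts.
    K    : number of topics.
    m    : assumed extension scalar (>= 1) pushing centroids from the
           global mean toward the boundary of the vocabulary simplex.
    """
    # Normalize each document to a point on the vocabulary simplex.
    X = W / W.sum(axis=1, keepdims=True)

    # Clustering step: plain k-means stands in for the paper's
    # weighted clustering procedure.
    km = KMeans(n_clusters=K, n_init=10, random_state=seed).fit(X)
    centers = km.cluster_centers_

    # Geometric correction: extend each centroid along the ray from
    # the global mean through the centroid, then project back onto
    # the simplex (clip negatives, renormalize).
    global_mean = X.mean(axis=0)
    topics = global_mean + m * (centers - global_mean)
    topics = np.clip(topics, 0.0, None)
    topics /= topics.sum(axis=1, keepdims=True)
    return topics
```

Because each extended centroid's entries sum to one before clipping, the renormalization is always well defined; the clipping is what pushes topic estimates toward sparse vertices of the simplex.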



Geometric Dirichlet Means algorithm for topic inference

Yurochkin, Mikhail, Nguyen, XuanLong

arXiv.org Machine Learning
