


b51a15f382ac914391a58850ab343b00-Reviews.html

Neural Information Processing Systems

The authors perform an analysis of "Hogwild" parallel Gibbs sampling for Gaussian distributions and show a connection between Gauss-Seidel / Jacobi and the Hogwild routine. They exploit this connection to show conditions for when this parallel Gibbs sampling process converges to the correct mean, and they are also able to make statements about the covariances of the system.

I enjoyed reading the connection between Gauss-Seidel and Gauss-Jacobi and parallel Gaussian Gibbs sampling and find that this type of analysis is very useful for the NIPS community as parallel Gibbs sampling has received relatively little theoretical attention.

A few comments: 1) I guess for the simple case of Gaussians, parallel Gibbs sampling is overkill as one can just directly obtain samples quickly for any multivariate Gaussian (but of course it is useful for analysis purposes). It would be nice to point this out (not sure if I agree with the sentiment in line 69) and also to make a few statements about how this analysis could also be applied to non-Gaussian cases. It seems that for a given iteration t, the same set of v samples would have to be globally shared across all processors for this method to work (e.g., Eq. 4)?
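To make the connection discussed in this review concrete, here is a minimal sketch and not the authors' code. It assumes a Gaussian in information form with precision matrix `J` and potential vector `h`, so the mean solves `J mu = h`; these symbols, the function names, and the toy matrix are all illustrative assumptions. In expectation, a sequential Gibbs sweep is a Gauss-Seidel sweep on this linear system. A fully synchronous sweep, used here as the extreme case of Hogwild-style updates, is a Jacobi step in expectation. The last function shows the direct Cholesky sampling that the first comment alludes to.

```python
import numpy as np

def gauss_seidel_step(J, h, m):
    """One Gauss-Seidel sweep for J m = h, updating coordinates in order."""
    m = m.copy()
    for i in range(len(h)):
        # The off-diagonal sum uses entries already updated in this sweep.
        m[i] = (h[i] - J[i] @ m + J[i, i] * m[i]) / J[i, i]
    return m

def jacobi_step(J, h, m):
    """One Jacobi step for J m = h: every coordinate reads the old iterate."""
    d = np.diag(J)
    return (h - J @ m + d * m) / d

def gibbs_sweep(J, h, x, rng, synchronous=False):
    """One Gibbs sweep over N(J^{-1} h, J^{-1}).

    synchronous=False: ordinary sequential Gibbs (mean follows Gauss-Seidel).
    synchronous=True:  all coordinates resampled from the same stale state
                       (mean follows Jacobi); an idealised Hogwild sweep.
    """
    d = np.diag(J)
    if synchronous:
        cond_mean = (h - J @ x + d * x) / d
        return cond_mean + rng.standard_normal(len(h)) / np.sqrt(d)
    x = x.copy()
    for i in range(len(h)):
        cond_mean = (h[i] - J[i] @ x + J[i, i] * x[i]) / J[i, i]
        x[i] = cond_mean + rng.standard_normal() / np.sqrt(J[i, i])
    return x

def direct_sample(J, h, rng):
    """Exact draw via Cholesky: J = L L^T, so L^{-T} z has covariance J^{-1}."""
    L = np.linalg.cholesky(J)
    mu = np.linalg.solve(J, h)
    z = rng.standard_normal(len(h))
    return mu + np.linalg.solve(L.T, z)

# Toy, diagonally dominant precision matrix (an assumption for illustration).
J = np.array([[2.0, 0.5, 0.0],
              [0.5, 2.0, 0.5],
              [0.0, 0.5, 2.0]])
h = np.array([1.0, 2.0, 3.0])
mu = np.linalg.solve(J, h)

m_gs = np.zeros(3)
m_jac = np.zeros(3)
for _ in range(50):
    m_gs = gauss_seidel_step(J, h, m_gs)
    m_jac = jacobi_step(J, h, m_jac)

print("exact mean:   ", mu)
print("Gauss-Seidel: ", m_gs)
print("Jacobi:       ", m_jac)
```

For this toy matrix both mean recursions converge to the exact mean because `J` is strictly diagonally dominant. The sketch says nothing about the covariance of the synchronous sampler, which in general differs from `J^{-1}`.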


Sparse Partially Collapsed MCMC for Parallel Inference in Topic Models

arXiv.org Machine Learning

Topic models, and more specifically the class of Latent Dirichlet Allocation (LDA) models, are widely used for probabilistic modeling of text. MCMC sampling from the posterior distribution is typically performed using a collapsed Gibbs sampler. We propose a parallel sparse partially collapsed Gibbs sampler and compare its speed and efficiency to state-of-the-art samplers for topic models on five well-known text corpora of differing sizes and properties. In particular, we propose and compare two different strategies for sampling the parameter block with latent topic indicators. The experiments show that the increase in statistical inefficiency from only partial collapsing is smaller than commonly assumed, and can be more than compensated by the speedup from parallelization and sparsity on larger corpora. We also prove that the partially collapsed samplers scale well with the size of the corpus. The proposed algorithm is fast, efficient, exact, and can be used in more modeling situations than the ordinary collapsed sampler.
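As a rough illustration of what partial collapsing can look like for LDA, here is a minimal sketch and not the algorithm proposed in the paper. It rests on assumptions the abstract does not spell out: the document-topic proportions are integrated out, the topic-word matrix `phi` is sampled explicitly, and the priors are symmetric Dirichlet with hypothetical hyperparameters `alpha` and `beta`. Under those assumptions the topic indicators of different documents are conditionally independent given `phi`, which is why the outer loop could be distributed over workers. The sparse sampling strategies mentioned in the abstract are omitted; the function name and toy corpus are made up for the example.

```python
import numpy as np

def pc_lda_sweep(docs, z, phi, alpha, beta, rng):
    """One sweep of a (dense) partially collapsed Gibbs sampler for LDA.

    docs : list of int arrays, word ids per document
    z    : list of int arrays, topic indicator per token (updated in place)
    phi  : (K, V) array, topic-word probabilities
    """
    K, V = phi.shape
    n_kw = np.zeros((K, V))  # topic-word counts for the new indicators
    # Given phi, each document can be processed independently of the others.
    for d, words in enumerate(docs):
        n_dk = np.bincount(z[d], minlength=K).astype(float)
        for i, w in enumerate(words):
            n_dk[z[d][i]] -= 1                 # remove the current token
            p = phi[:, w] * (n_dk + alpha)     # p(z = k | phi, rest of doc d)
            k = rng.choice(K, p=p / p.sum())
            z[d][i] = k
            n_dk[k] += 1
            n_kw[k, w] += 1
    # Rows of phi are conditionally independent Dirichlets given z.
    phi_new = np.vstack([rng.dirichlet(n_kw[k] + beta) for k in range(K)])
    return z, phi_new

# Toy corpus with V = 5 word types and K = 2 topics (illustrative only).
rng = np.random.default_rng(0)
docs = [np.array([0, 1, 2, 1]), np.array([3, 4, 3]), np.array([0, 4, 2, 2])]
K, V, alpha, beta = 2, 5, 0.5, 0.5
z = [rng.integers(K, size=len(words)) for words in docs]
phi = rng.dirichlet(np.full(V, beta), size=K)

for _ in range(20):
    z, phi = pc_lda_sweep(docs, z, phi, alpha, beta, rng)

print(phi.round(3))
```

In a fully collapsed sampler the token updates would also depend on global topic-word counts that change with every token, which is what makes exact parallelization harder there.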