Many recent advances in large-scale probabilistic inference rely on variational methods. The success of variational approaches depends on (i) formulating a flexible parametric family of distributions, and (ii) optimizing the parameters to find the member of this family that most closely approximates the exact posterior. In this paper we present a new approximating family of distributions, the variational sequential Monte Carlo (VSMC) family, and show how to optimize it in variational inference. VSMC melds variational inference (VI) and sequential Monte Carlo (SMC), providing practitioners with flexible, accurate, and powerful Bayesian inference. The VSMC family can approximate the posterior arbitrarily well while still allowing for efficient optimization of its parameters. We demonstrate its utility on state space models, stochastic volatility models for financial data, and deep Markov models of brain neural circuits.
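As an illustration of the SMC ingredient only, the following is a minimal sketch of a bootstrap particle filter that returns the log of its marginal likelihood estimate. Because that estimate is unbiased, its expected logarithm is a lower bound on the true log marginal likelihood, which is what makes such a quantity a natural target for variational optimization. The linear Gaussian model, the parameter names `a`, `q`, `r`, and the function names are assumptions made for this example and are not taken from the paper.

```python
import math
import random


def log_normal_pdf(x, mean, var):
    """Log density of a univariate Normal(mean, var) at x."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def smc_log_marginal(ys, num_particles, a=0.9, q=1.0, r=1.0, rng=None):
    """Bootstrap particle filter for an assumed toy model:
         x_1 ~ N(0, 1),  x_t = a * x_{t-1} + N(0, q),  y_t = x_t + N(0, r).
    Returns the log of the SMC estimate of p(y_1, ..., y_T).
    """
    rng = rng or random.Random(0)
    particles = [rng.gauss(0.0, 1.0) for _ in range(num_particles)]
    weights = [1.0] * num_particles
    log_z = 0.0
    for t, y in enumerate(ys):
        if t > 0:
            # Multinomial resampling by the previous weights, then propagation
            # through the transition density (the "bootstrap" proposal).
            ancestors = rng.choices(particles, weights=weights, k=num_particles)
            particles = [a * x + rng.gauss(0.0, math.sqrt(q)) for x in ancestors]
        log_w = [log_normal_pdf(y, x, r) for x in particles]
        m = max(log_w)  # subtract the max for numerical stability
        weights = [math.exp(lw - m) for lw in log_w]
        log_z += m + math.log(sum(weights) / num_particles)
    return log_z
```

In a variational setting the fixed transition proposal would be replaced by a parametric proposal whose parameters are tuned to raise the expected value of this quantity; the sketch keeps the proposal fixed for brevity.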
The question of how to determine the number of independent latent factors (topics) in mixture models such as Latent Dirichlet Allocation (LDA) is of great practical importance. In most applications, the exact number of topics is unknown, and depends on the application and the size of the data set. Bayesian nonparametric methods can avoid the problem of topic number selection, but they can be impractically slow for large sample sizes and are subject to local optima. We develop a guaranteed procedure for topic number recovery that does not require learning the model's latent parameters beforehand. Our procedure relies on adapting results from random matrix theory. Our topic number recovery procedure outperforms hLDA, a nonparametric method. We also discuss some implications of our results for the sample complexity and accuracy of popular spectral learning algorithms for LDA. Our results and procedure can be extended to spectral learning algorithms for other exchangeable mixture models as well as Hidden Markov Models.
In this paper we demonstrate that tempering Markov chain Monte Carlo samplers for Bayesian models by recursively subsampling observations without replacement can improve the performance of baseline samplers in terms of effective sample size per computation. We present two tempering-by-subsampling algorithms, subsampled parallel tempering and subsampled tempered transitions. We provide an asymptotic analysis of the computational cost of tempering by subsampling, verify that tempering by subsampling costs less than traditional tempering, and demonstrate both algorithms on Bayesian approaches to learning the mean of a high-dimensional multivariate Normal and estimating Gaussian process hyperparameters.
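The idea of subsampled parallel tempering can be sketched as follows, under stated assumptions: each chain targets the posterior given a nested subset of the data (drawn without replacement from the previous level), random-walk Metropolis updates run within each chain, and a standard parallel tempering swap is proposed between adjacent levels. The univariate Normal-mean model, the halving schedule, and all function and parameter names are choices made for this example, not details from the paper.

```python
import math
import random


def log_post(theta, data, prior_sd=10.0, noise_sd=1.0):
    """Unnormalized log posterior of a Normal mean with a Normal prior."""
    lp = -0.5 * (theta / prior_sd) ** 2
    lp += sum(-0.5 * ((y - theta) / noise_sd) ** 2 for y in data)
    return lp


def nested_subsets(data, num_levels, rng):
    """Level 0 is the full data; each later level halves the previous one."""
    levels = [list(data)]
    for _ in range(num_levels - 1):
        prev = levels[-1]
        levels.append(rng.sample(prev, max(1, len(prev) // 2)))
    return levels


def accept(log_ratio, rng):
    """Metropolis acceptance test for a given log acceptance ratio."""
    return rng.random() < math.exp(min(0.0, log_ratio))


def subsampled_parallel_tempering(data, num_levels=3, num_iters=2000,
                                  step=0.5, seed=0):
    rng = random.Random(seed)
    levels = nested_subsets(data, num_levels, rng)
    thetas = [0.0] * num_levels
    samples = []
    for _ in range(num_iters):
        # Within-chain random-walk Metropolis update at every level.
        for k, subset in enumerate(levels):
            prop = thetas[k] + rng.gauss(0.0, step)
            if accept(log_post(prop, subset) - log_post(thetas[k], subset), rng):
                thetas[k] = prop
        # Propose swapping the states of one random pair of adjacent levels.
        k = rng.randrange(num_levels - 1)
        log_swap = (log_post(thetas[k + 1], levels[k])
                    + log_post(thetas[k], levels[k + 1])
                    - log_post(thetas[k], levels[k])
                    - log_post(thetas[k + 1], levels[k + 1]))
        if accept(log_swap, rng):
            thetas[k], thetas[k + 1] = thetas[k + 1], thetas[k]
        samples.append(thetas[0])  # only the full-data chain is kept
    return samples
```

Chains on smaller subsets see a flatter posterior and are cheaper to evaluate, which is the source of the cost advantage over tempering the full likelihood.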
Approximate Bayesian computation (ABC) is a powerful and elegant framework for performing inference in simulation-based models. However, due to the difficulty in scaling likelihood estimates, ABC remains useful only for relatively low-dimensional problems. We introduce Hamiltonian ABC (HABC), a set of likelihood-free algorithms that apply recent advances in scaling Bayesian learning using Hamiltonian Monte Carlo (HMC) and stochastic gradients. We find that a small number of forward simulations can effectively approximate the ABC gradient, allowing Hamiltonian dynamics to efficiently traverse parameter spaces. We also describe a new, simple yet general approach of incorporating random seeds into the state of the Markov chain, further reducing the random walk behavior of HABC. We demonstrate HABC on several typical ABC problems, and show that HABC samples comparably to regular Bayesian inference using true gradients on a high-dimensional problem from machine learning.
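To make the role of forward simulations and random seeds concrete, here is a minimal sketch of a simulation-based gradient estimate. It uses central finite differences with a shared seed for the two perturbed simulations (common random numbers), which is one plausible way to approximate such a gradient and is an assumption of this example rather than the estimator used in the paper. The toy simulator, the Gaussian ABC kernel with bandwidth `eps`, and all names are hypothetical.

```python
import random


def simulate_summary(theta, seed, n=20):
    """Hypothetical simulator: mean of n draws from Normal(theta, 1).
    Fixing the seed makes the output a deterministic function of theta."""
    rng = random.Random(seed)
    return sum(theta + rng.gauss(0.0, 1.0) for _ in range(n)) / n


def abc_log_kernel(theta, observed, seed, eps=0.5):
    """Log of a Gaussian ABC kernel comparing simulated and observed summaries."""
    diff = simulate_summary(theta, seed) - observed
    return -0.5 * (diff / eps) ** 2


def abc_gradient(theta, observed, seeds, delta=1e-3):
    """Average central finite-difference gradient over a few seeds.
    Each seed costs two forward simulations that share their random numbers."""
    total = 0.0
    for seed in seeds:
        up = abc_log_kernel(theta + delta, observed, seed)
        down = abc_log_kernel(theta - delta, observed, seed)
        total += (up - down) / (2 * delta)
    return total / len(seeds)
```

Holding the seed fixed removes simulator noise from the difference, so even a handful of seeds gives a usable gradient direction; treating the seed as part of the sampler state, as the abstract describes, builds on the same observation.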
We propose a new algorithm for posterior sampling of Kingman's coalescent, based upon the Particle Markov Chain Monte Carlo methodology. Specifically, the algorithm is an instantiation of the Particle Gibbs Sampling method, which alternately samples coalescent times conditioned on coalescent tree structures, and tree structures conditioned on coalescent times via the conditional Sequential Monte Carlo procedure. We implement our algorithm as a C++ package, and demonstrate its utility via a parameter estimation task in population genetics on both single- and multiple-locus data. The experimental results show that the proposed algorithm performs comparably to or better than several well-developed methods.
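The two quantities the sampler alternates between, coalescent times and tree structure, can be seen in a minimal sketch of drawing from the Kingman coalescent prior. With k lineages remaining, the waiting time to the next merger is exponential with rate k(k-1)/2 and the merging pair is chosen uniformly at random. This simulates the prior only and is not the posterior sampler described above; the function name and event format are assumptions for illustration.

```python
import random


def sample_kingman_coalescent(num_leaves, seed=0):
    """Draw one genealogy from the Kingman coalescent prior.

    Returns a list of merge events (time, left_child, right_child, parent),
    where leaves are labelled 0..num_leaves-1 and internal nodes get new ids.
    """
    rng = random.Random(seed)
    lineages = list(range(num_leaves))
    next_id = num_leaves
    time = 0.0
    events = []
    while len(lineages) > 1:
        k = len(lineages)
        # Coalescent time: exponential waiting time with rate k choose 2.
        time += rng.expovariate(k * (k - 1) / 2)
        # Tree structure: a uniformly chosen pair of lineages merges.
        left, right = rng.sample(lineages, 2)
        lineages.remove(left)
        lineages.remove(right)
        lineages.append(next_id)
        events.append((time, left, right, next_id))
        next_id += 1
    return events
```

The times and the sequence of merging pairs are generated by separate draws, which is the factorization that a Gibbs-style scheme conditioning on one while updating the other relies on.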