Gaussian variational approximation with sparse precision matrices

Tan, Linda S. L., Nott, David J.

arXiv.org Machine Learning 

The stochastic gradients constructed in this manner are "doubly stochastic" because they are built upon two sources of stochasticity, which come from sampling from the variational distribution and from the full data set. This approach is very general in that it can be applied to any model where the joint density is differentiable. Unlike variational Bayes, it does not assume independence relationships among blocks of an appropriate partition of θ. Such independence assumptions have been shown to result in underestimation of the posterior variance (Wang and Titterington, 2005; Bishop, 2006). The quality of the resulting approximation is thus limited only by how well the form of q(θ) matches the true posterior. Using this approach, Kucukelbir et al. (2016) develop an automatic differentiation variational inference (ADVI) algorithm in Stan, where q(θ) is assumed to be either a diagonal (mean-field) or an unrestricted Gaussian variational approximation. Constrained variables are transformed to the real line via Stan's library of transformations, and the gradients are computed using Monte Carlo integration. They note that while unrestricted ADVI is able to capture posterior correlations, and hence produces more accurate marginal variance estimates than mean-field ADVI, it can be prohibitively slow for large data sets since the number of variational parameters scales as the square of the length of θ. In this article, we consider variational approximations that take the form of a multivariate Gaussian distribution N(µ, Σ) for models with high-dimensional parameters (µ denotes the mean and Σ the covariance matrix).
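To make the two sources of randomness concrete, the following is a minimal sketch of a doubly stochastic gradient estimate for a Gaussian approximation N(µ, LLᵀ). It is an illustration under stated assumptions, not the algorithm developed in this article: it parameterizes the covariance by a dense lower-triangular factor L (rather than a sparse precision matrix), and it uses a hypothetical toy model, linear regression with unit noise variance and a N(0, 100·I) prior. All function names are invented for the example. The last helper counts variational parameters, which shows the quadratic growth of the unrestricted Gaussian relative to the diagonal one.

```python
import random


def grad_log_prior(theta, prior_var=100.0):
    # Assumed prior: theta ~ N(0, prior_var * I).
    return [-t / prior_var for t in theta]


def grad_log_lik_i(theta, x_i, y_i):
    # Assumed likelihood: y_i ~ N(x_i . theta, 1).
    resid = y_i - sum(x * t for x, t in zip(x_i, theta))
    return [resid * x for x in x_i]


def minibatch_grad_log_joint(theta, X, y, batch):
    # Unbiased estimate of grad_theta log p(y, theta): the minibatch sum of
    # likelihood gradients is rescaled by n / |batch|.
    scale = len(y) / len(batch)
    g = grad_log_prior(theta)
    for i in batch:
        g_i = grad_log_lik_i(theta, X[i], y[i])
        g = [a + scale * b for a, b in zip(g, g_i)]
    return g


def doubly_stochastic_grad(mu, L, X, y, batch_size, rng):
    # One-sample estimate of the lower-bound gradient w.r.t. mu and L.
    d = len(mu)
    # Source 1: a draw from q via theta = mu + L s, with s ~ N(0, I).
    s = [rng.gauss(0.0, 1.0) for _ in range(d)]
    theta = [mu[i] + sum(L[i][j] * s[j] for j in range(i + 1))
             for i in range(d)]
    # Source 2: a random minibatch of the data.
    batch = rng.sample(range(len(y)), batch_size)
    g = minibatch_grad_log_joint(theta, X, y, batch)

    grad_mu = list(g)
    grad_L = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            grad_L[i][j] = g[i] * s[j]
        # Entropy of q contributes sum_i log L_ii, hence 1 / L_ii.
        grad_L[i][i] += 1.0 / L[i][i]
    return grad_mu, grad_L


def n_variational_params(d, full=True):
    # Mean (d) plus either a lower-triangular factor or d diagonal scales.
    return d + (d * (d + 1) // 2 if full else d)


if __name__ == "__main__":
    rng = random.Random(1)
    X = [[1.0, float(i)] for i in range(8)]
    y = [0.5 + 1.0 * i + rng.gauss(0.0, 1.0) for i in range(8)]
    mu, L = [0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]]
    print(doubly_stochastic_grad(mu, L, X, y, batch_size=4, rng=rng))
    print(n_variational_params(1000, full=False),
          n_variational_params(1000, full=True))
```

Each call returns a noisy but unbiased estimate of the lower-bound gradient, so in a stochastic gradient ascent loop it would be combined with a decreasing or adaptive step size. The dense factor L is what makes the unrestricted approximation expensive in high dimensions.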
