 f-vi


Amortized Variational Inference: When and Why?

Margossian, Charles C., Blei, David M.

arXiv.org Machine Learning

Variational inference is a class of methods to approximate the posterior distribution of a probabilistic model. The classic factorized (or mean-field) variational inference (F-VI) fits a separate parametric distribution for each latent variable. The more modern amortized variational inference (A-VI) instead learns a common inference function, which maps each observation to its corresponding latent variable's approximate posterior. Typically, A-VI is used as a cog in the training of variational autoencoders; however, it stands to reason that A-VI could also be used as a general alternative to F-VI. In this paper we study when and why A-VI can be used for approximate Bayesian inference. We establish that A-VI cannot achieve a better solution than F-VI, no matter how expressive the inference function is, which leads to the so-called amortization gap. We then address a central theoretical question: When can A-VI attain F-VI's optimal solution? We derive conditions on the model, which are necessary, sufficient, and verifiable, under which the amortization gap can be closed. We show that simple hierarchical models, which encompass many models in machine learning and Bayesian statistics, satisfy these conditions. We demonstrate, on a broader class of models, how to expand the domain of A-VI's inference function to improve its solution, and we provide examples, e.g. hidden Markov models, where the amortization gap cannot be closed. Finally, when A-VI can match F-VI's solution, we empirically find that the required complexity of the inference function does not grow with the data size and that A-VI often converges faster.
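The contrast between F-VI and A-VI described in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the toy model (z_n ~ N(0, 1), x_n | z_n ~ N(z_n, sigma2)), the linear inference function a * x + b, the step size, and all function names are assumptions chosen so that the ELBO gradients are available in closed form. F-VI keeps one mean and one log-standard-deviation per latent variable, whereas A-VI shares three parameters across all observations.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma2 = 1.0   # assumed known likelihood variance
N = 200
# Toy hierarchical model (an assumption): z_n ~ N(0, 1), x_n | z_n ~ N(z_n, sigma2)
x = rng.normal(size=N) + np.sqrt(sigma2) * rng.normal(size=N)

def elbo_grads(m, r, x):
    """Closed-form ELBO gradients for q(z_n) = N(m, exp(r)^2) in the toy model."""
    grad_m = -m + (x - m) / sigma2                        # w.r.t. the mean
    grad_r = 1.0 - np.exp(2 * r) * (1.0 + 1.0 / sigma2)   # w.r.t. the log-std
    return grad_m, grad_r

def fit_fvi(x, steps=500, lr=0.1):
    """F-VI: a separate (mean, log-std) pair for every latent variable."""
    m, r = np.zeros_like(x), np.zeros_like(x)
    for _ in range(steps):
        gm, gr = elbo_grads(m, r, x)
        m += lr * gm
        r += lr * gr
    return m, np.exp(r)

def fit_avi(x, steps=500, lr=0.1):
    """A-VI: one shared inference function x -> (a * x + b, exp(r))."""
    a, b, r = 0.0, 0.0, 0.0
    for _ in range(steps):
        gm, gr = elbo_grads(a * x + b, r, x)
        a += lr * np.mean(gm * x)   # chain rule through m = a * x + b
        b += lr * np.mean(gm)
        r += lr * np.mean(gr)
    return a, b, np.exp(r)

m_f, s_f = fit_fvi(x)
a, b, s_a = fit_avi(x)

# Exact posterior of the toy model, for comparison
exact_mean = x / (1.0 + sigma2)
exact_std = np.sqrt(sigma2 / (1.0 + sigma2))

print("max |F-VI mean - exact mean|:", np.max(np.abs(m_f - exact_mean)))
print("max |A-VI mean - F-VI mean| :", np.max(np.abs(a * x + b - m_f)))
```

In this toy model the exact posterior mean is linear in x_n, so a linear inference function is enough for A-VI to reach the F-VI solution with three parameters instead of 2N. This is in line with the abstract's claim about simple hierarchical models, but it says nothing about models where the gap cannot be closed.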


Generalized Variational Inference

Knoblauch, Jeremias, Jewson, Jack, Damoulas, Theodoros

arXiv.org Artificial Intelligence

This paper introduces a generalized representation of Bayesian inference. It is derived axiomatically, recovering existing Bayesian methods as special cases. We use it to prove that variational inference (VI) based on the Kullback-Leibler Divergence with a variational family Q produces the uniquely optimal Q-constrained approximation to the exact Bayesian inference problem. Surprisingly, this implies that standard VI dominates any other Q-constrained approximation to the exact Bayesian inference problem. This means that alternative Q-constrained approximations such as VI targeted at minimizing other divergences and Expectation Propagation can produce better posteriors than VI only by implicitly targeting more appropriate Bayesian inference problems. Inspired by this, we introduce Generalized Variational Inference (GVI), a modular approach for instead solving such alternative inference problems explicitly. We explore some applications of GVI, including robustness and better marginals. Lastly, we derive black box GVI and apply it to Bayesian Neural Networks as well as Deep Gaussian Processes, where GVI comprehensively outperforms competing methods.
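For orientation, the standard VI problem this abstract refers to can be written as an optimization over the variational family Q. The notation below is an assumption, not taken from the abstract: a parameter \theta, observations x_1, ..., x_n, a prior \pi, and a likelihood p.

```latex
q^{*}_{\mathrm{VI}}
  = \operatorname*{arg\,min}_{q \in \mathcal{Q}}
    \left\{
      \mathbb{E}_{q(\theta)}\!\left[ -\sum_{i=1}^{n} \log p(x_i \mid \theta) \right]
      + \mathrm{KL}\!\left( q \,\|\, \pi \right)
    \right\}
```

This is the negative evidence lower bound. One natural reading of the modular generalization described in the abstract is that the negative log-likelihood term is replaced by a general loss and the Kullback-Leibler term by another divergence. That reading is a sketch only, since the abstract does not state the objective explicitly.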