Analytical Probability Distributions and Exact Expectation-Maximization for Deep Generative Networks
Deep Generative Networks (DGNs) with probabilistic modeling of their output and latent space are currently trained via Variational Autoencoders (VAEs). In the absence of a known analytical form for the posterior and likelihood expectation, VAEs resort to approximations, including (Amortized) Variational Inference (AVI) and Monte-Carlo sampling. We exploit the Continuous Piecewise Affine property of modern DGNs to derive their posterior and marginal distributions as well as the latter's first two moments. These findings enable us to derive an analytical Expectation-Maximization (EM) algorithm for gradient-free DGN learning. We demonstrate empirically that EM training of DGNs achieves higher likelihoods than VAE training. Our new framework will guide the design of new VAE AVI schemes that better approximate the true posterior and open new avenues to apply standard statistical tools for model comparison, anomaly detection, and missing data imputation.
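To make the Continuous Piecewise Affine property concrete, here is a minimal NumPy sketch (not from the paper; the function name `region_affine_params`, the leaky-ReLU slope, and the toy dimensions are all hypothetical choices for illustration). It recovers the single affine map g(u) = A u + b that a leaky-ReLU DGN realizes on the activation region containing a given latent point, and checks it against the network's forward pass:

```python
import numpy as np

def region_affine_params(weights, biases, z, alpha=0.2):
    """For a leaky-ReLU DGN g, return (A, b) of the affine map
    g(u) = A u + b valid on the activation region containing z."""
    A, b = np.eye(len(z)), np.zeros(len(z))
    h = z.copy()
    for l, (W, c) in enumerate(zip(weights, biases)):
        pre = W @ h + c                       # pre-activation of layer l
        A, b = W @ A, W @ b + c               # compose the affine layer
        if l < len(weights) - 1:              # leaky-ReLU on hidden layers only
            q = np.where(pre > 0, 1.0, alpha) # region's activation pattern
            h = q * pre
            A, b = q[:, None] * A, q * b      # fold the (fixed) pattern in
        else:
            h = pre
    return A, b

# sanity check on a random 2 -> 16 -> 16 -> 5 network
rng = np.random.default_rng(0)
dims = [2, 16, 16, 5]
Ws = [rng.standard_normal((dims[i + 1], dims[i])) for i in range(3)]
cs = [rng.standard_normal(dims[i + 1]) for i in range(3)]
z = rng.standard_normal(2)
A, b = region_affine_params(Ws, cs, z)
h = z
for l, (W, c) in enumerate(zip(Ws, cs)):     # plain forward pass
    pre = W @ h + c
    h = np.where(pre > 0, pre, 0.2 * pre) if l < 2 else pre
assert np.allclose(A @ z + b, h)
```

The invariant is that within one region the activation pattern q is fixed, so the composition of layers collapses into a single affine map; the paper's analytical derivations rest on integrating Gaussian densities region by region against exactly such per-region (A, b) pairs.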
Review for NeurIPS paper: Analytical Probability Distributions and Exact Expectation-Maximization for Deep Generative Networks
Summary and Contributions: Deep generative models (DGMs), specifically variational autoencoders (VAEs), currently rely on variational inference and stochastic optimization of a lower bound to maximize likelihood, since the analytic likelihood cannot be computed in general. This paper shows that in fact the likelihood can be computed analytically and maximized with analytic expectation-maximization (EM) updates when the network uses piecewise affine nonlinearities such as ReLU and leaky-ReLU. The key insight is that these networks induce a partition of the latent space that can be handled tractably when the prior and likelihood are both Gaussian. This paper analytically derives the posterior distribution, the marginal distribution, the expectation of the complete likelihood (for the E step), and the updates to the parameters (for the M step). These novel derivations allow the authors to perform EM on DGMs for the first time.
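As a rough illustration of why the Gaussian assumptions make each region tractable: on a region where the generator reduces to g(u) = A u + b, standard linear-Gaussian conditioning gives a closed-form Gaussian for the (un-truncated) conditional of z given x. Below is a minimal sketch under the assumptions of a standard-normal prior and an isotropic Gaussian likelihood with variance sigma2; the function name is hypothetical, and the paper's exact posterior additionally truncates each such Gaussian to its region and mixes over all regions, which this sketch does not do:

```python
import numpy as np

def region_posterior_moments(A, b, x, sigma2=1.0):
    """On one latent region where g(u) = A u + b, with prior z ~ N(0, I)
    and likelihood x | z ~ N(g(z), sigma2 * I), the un-truncated
    conditional is N(mu, Sigma) by standard linear-Gaussian formulas."""
    Sigma = np.linalg.inv(np.eye(A.shape[1]) + (A.T @ A) / sigma2)
    mu = Sigma @ (A.T @ (x - b)) / sigma2
    return mu, Sigma

# example: posterior moments on the region of a random affine map
rng = np.random.default_rng(1)
A = rng.standard_normal((5, 2))   # region slope (output dim 5, latent dim 2)
b = rng.standard_normal(5)        # region offset
x = rng.standard_normal(5)        # observed sample
mu, Sigma = region_posterior_moments(A, b, x, sigma2=0.5)
```

Summing such per-region Gaussians, restricted to their regions and suitably weighted, is what yields the analytic posterior and marginal; the E-step expectations then follow from the moments of these truncated Gaussians.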