Reviews: DVAE#: Discrete Variational Autoencoders with Relaxed Boltzmann Priors

Neural Information Processing Systems 

A key aspect of these works is retaining the ability to train the models with low-variance, reparameterisation-trick-based gradient estimates of the variational objective by relaxing the discrete latent variables with associated continuous-valued variables. Of particular significance to this submission are the discrete VAE (dVAE) (Rolfe, 2016) and dVAE++ (Vahdat et al., 2018) models, which use a Boltzmann machine (BM) prior on the discrete latent variables and construct a differentiable proxy variational objective by introducing continuous variables \zeta corresponding to relaxations of the discrete variables z, with \zeta depending on z via a *smoothing* conditional distribution r(\zeta | z). The generative process in the decoder model is specified such that generated outputs x are conditionally independent of the discrete variables z given the continuous variables \zeta. An issue identified with the (differentiable proxy) variational objective used in both the dVAE and dVAE++ approaches is that it is not amenable to being formulated as an importance-weighted bound, even though importance-weighted objectives for continuous VAE models have been found to give significant improvements in training performance (Burda et al., 2015). In this submission the authors suggest an alternative dVAE formulation, which they term dVAE#, that is able to use an importance-weighted objective.
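
For reference, the importance-weighted bound of Burda et al. (2015) that motivates this change can be sketched numerically as below. This is only a minimal illustration of the generic K-sample bound, not the submission's implementation: the function names are hypothetical, and the log-weights are assumed to be log p(x, h_k) - log q(h_k | x) for K latent samples h_k drawn from the approximate posterior.

```python
import math


def importance_weighted_bound(log_weights):
    """K-sample importance-weighted bound: log((1/K) * sum_k w_k).

    `log_weights` is assumed to hold log w_k = log p(x, h_k) - log q(h_k | x)
    for K samples h_k ~ q(h | x). Hypothetical helper for illustration only.
    """
    k = len(log_weights)
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(log_weights)
    return m + math.log(sum(math.exp(lw - m) for lw in log_weights)) - math.log(k)


def averaged_single_sample_bound(log_weights):
    """Average of the log-weights, i.e. a Monte Carlo estimate of the usual
    single-sample variational bound from the same K samples."""
    return sum(log_weights) / len(log_weights)
```

By Jensen's inequality, the first quantity is never smaller than the second for the same set of log-weights, which is the sense in which the importance-weighted objective gives a tighter bound.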