A Training Energy-based Priors using

Neural Information Processing Systems 

In this section, we show how a VAE with an energy-based model in its prior can be trained. We discuss maximizing the variational bound in VAEs from the prior's perspective, where the entropy term H(q(z)) can be ignored, as the minimization is with respect to the parameters of the prior p(z).

The binary classifier is composed of two types of residual blocks, as in Figure 1.

Figure 1: Residual blocks used in the binary classifier.

An excitation operation (non-linear transformation) is applied to these values to get per-channel weights.
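The excitation step can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes the per-channel values come from global average pooling and that the non-linear transformation is a two-layer bottleneck (ReLU followed by a sigmoid), as in standard squeeze-and-excitation blocks. The function name `squeeze_excite`, the reduction ratio `r`, and the weight shapes are all hypothetical.

```python
import numpy as np

def squeeze_excite(x, w1, b1, w2, b2):
    """Rescale a (C, H, W) feature map by per-channel weights.

    Assumed design: mean-pool squeeze, ReLU + sigmoid excitation.
    """
    # Squeeze (assumption): one value per channel via global average pooling.
    s = x.mean(axis=(1, 2))                          # shape (C,)
    # Excitation: non-linear transformation of the squeezed values.
    h = np.maximum(0.0, w1 @ s + b1)                 # shape (C // r,)
    weights = 1.0 / (1.0 + np.exp(-(w2 @ h + b2)))   # shape (C,), each in (0, 1)
    # Scale every channel of the input by its weight.
    return x * weights[:, None, None], weights

# Hypothetical sizes, chosen only for illustration.
rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 4, 2
x = rng.normal(size=(C, H, W))
w1 = rng.normal(size=(C // r, C)) * 0.1
b1 = np.zeros(C // r)
w2 = rng.normal(size=(C, C // r)) * 0.1
b2 = np.zeros(C)

y, weights = squeeze_excite(x, w1, b1, w2, b2)
```

Here `weights` holds one gate per channel, and `y` is the input with each channel scaled by its gate. In a residual block this rescaled map would typically be added back to the block input.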