Appendix for: Invertible Gaussian Reparameterization
Neural Information Processing Systems
As mentioned in Section 3.1, we can use the matrix determinant lemma to efficiently compute the determinant of the Jacobian of the softmax.

Proof: For k = 1, ..., K − 1, we have an expression for P(H = k) as a one-dimensional integral. Note that the involved integrals are one-dimensional and can thus be accurately approximated with quadrature methods.

As mentioned in the main manuscript, our VAE experiments closely follow Maddison et al. [4]: we use the same continuous objective and the same evaluation metrics. Using the former KL results in optimizing a continuous objective that is no longer a log-likelihood lower bound, which is the main reason we followed Maddison et al. [4]. In addition to the comparisons reported in the main manuscript, we include further comparisons in Table 1, reporting the discretized training ELBO instead. These are variance reduction techniques that lean heavily on the GS to improve the variance of the obtained gradients.
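The right-hand side of P(H = k) does not survive in this excerpt. As a sketch of the kind of one-dimensional integral involved, assume H is the argmax of independent Gaussian coordinates Y_j ~ N(μ_j, σ_j²) (an assumed setup for illustration, not necessarily the paper's exact construction); then a standard expression is:

```latex
P(H = k)
  = P\left(Y_k > Y_j \;\; \forall\, j \neq k\right)
  = \int_{-\infty}^{\infty} \frac{1}{\sigma_k}\,
    \varphi\!\left(\frac{y - \mu_k}{\sigma_k}\right)
    \prod_{j \neq k} \Phi\!\left(\frac{y - \mu_j}{\sigma_j}\right) \mathrm{d}y,
```

where φ and Φ denote the standard normal pdf and cdf. The integrand is smooth and bounded, which is what makes quadrature accurate here.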
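To illustrate the quadrature claim, here is a minimal sketch (the `mu`/`sigma` values are made-up example parameters) that approximates such an argmax probability with Gauss–Hermite quadrature:

```python
import math
import numpy as np

# Sketch: when H is the argmax of independent Gaussians Y_j ~ N(mu_j, sigma_j^2),
# P(H = k) is a one-dimensional integral that Gauss-Hermite quadrature handles well.

std_normal_cdf = np.vectorize(lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0))))

def prob_argmax(mu, sigma, k, n_nodes=60):
    """P(Y_k > Y_j for all j != k) via Gauss-Hermite quadrature."""
    x, w = np.polynomial.hermite.hermgauss(n_nodes)
    # Substitution y = mu_k + sqrt(2) * sigma_k * x absorbs the Gaussian weight.
    y = mu[k] + math.sqrt(2.0) * sigma[k] * x
    f = np.ones_like(y)
    for j in range(len(mu)):
        if j != k:
            f *= std_normal_cdf((y - mu[j]) / sigma[j])
    return float((w * f).sum() / math.sqrt(math.pi))

mu = np.array([0.5, -0.2, 0.1])     # illustrative means
sigma = np.array([1.0, 0.7, 1.3])   # illustrative standard deviations
probs = [prob_argmax(mu, sigma, k) for k in range(len(mu))]
```

With 60 nodes the approximated probabilities sum to one to high precision, since the integrand is smooth and bounded.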
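The matrix determinant lemma step can be sketched as follows. Assuming the relevant Jacobian has the rank-one structure diag(s) − s sᵀ of a softmax-style map (the "+ 1.0" in the normalizer below is an illustrative choice, not the paper's exact definition), the lemma det(A + u vᵀ) = (1 + vᵀ A⁻¹ u) det(A) gives the determinant in O(K) instead of O(K³):

```python
import numpy as np

def det_via_lemma(s):
    """det(diag(s) - s s^T) via the matrix determinant lemma.

    With A = diag(s), u = -s, v = s:
      det(A + u v^T) = (1 + v^T A^{-1} u) det(A) = (1 - sum(s)) * prod(s).
    """
    return (1.0 - s.sum()) * np.prod(s)

rng = np.random.default_rng(0)
z = rng.normal(size=4)
# Softmax-like probabilities with extra mass in the normalizer (illustrative),
# so that sum(s) < 1 and the determinant is nonzero.
s = np.exp(z) / (np.exp(z).sum() + 1.0)

fast = det_via_lemma(s)                                # O(K)
slow = np.linalg.det(np.diag(s) - np.outer(s, s))      # O(K^3) reference
```

The O(K) formula matches the dense determinant up to floating-point error; note that with a plain softmax (sum(s) = 1) the determinant degenerates to zero, which is why the extra normalizer mass matters for invertibility.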