Statistical Guarantees for Variational Autoencoders using PAC-Bayesian Theory: Supplementary Material

Neural Information Processing Systems 

For example, the product measure p ⊗ q is a coupling of p and q.

[...] This expression can be greatly simplified when the distributions have diagonal covariance matrices.

[...] A proof can be found in Boucheron et al. (2013, [...]

We state and prove our first result. Note that the following lemma does not use Assumption 1. Moreover, the main difference between the inequality of this lemma and the one of Theorem 3.1 is the [...] In Lemma B.1, the expected loss for samples [...] In contrast, in Theorem 3.1, the expected loss for each [...]

[...] Markov's inequality to the positive random variable Y, defined as Y [...]

This, combined with (B.2), yields λ n [...]

Combining this equation with Equation (B.3) yields the theorem.
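
For reference, the Markov step invoked in the proof above has the following generic high-probability form. This is only a sketch of the standard inequality: the threshold $a$ and the confidence level $\delta \in (0, 1)$ are notation assumed here for illustration, and the specific definition of $Y$ used in the proof is not reproduced.

```latex
% Markov's inequality for a nonnegative random variable Y and any a > 0:
\[
  \mathbb{P}\left( Y \ge a \right) \le \frac{\mathbb{E}[Y]}{a}.
\]
% Choosing a = \mathbb{E}[Y] / \delta (assuming 0 < \mathbb{E}[Y] < \infty)
% bounds the failure probability by \delta, so that, with probability
% at least 1 - \delta,
\[
  Y \le \frac{\mathbb{E}[Y]}{\delta}.
\]
```

In PAC-Bayesian arguments this form is typically what turns a bound on an expectation into a statement that holds with probability at least $1 - \delta$ over the sample.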