Lemma B.4 (Lemma 2.1.8 in [4]). For any diffeomorphism $f \in \mathrm{Diff}^k_c(\mathbb{R}^d)$ and any $\delta > 0$, there exists a finite sequence of $(\delta, k)$-near-identity diffeomorphisms $g_1, \ldots, g_s$ such that $f = g_s \circ g_{s-1} \circ \cdots \circ g_1$.

Let $\pi_i \colon \mathbb{R}^d \to \mathbb{R}$ denote the projection onto the $i$-th coordinate, and suppose $f \colon \mathbb{R}^d \to \mathbb{R}^d$ is compactly supported and sufficiently $C^k$-close to the identity. In this section, we analyze how to make the affine coupling flow with dimension augmentation invertible. To handle this problem, we need to make sure that $\mathrm{Range}(F)$ is tractable for easy sampling.
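The decomposition in Lemma B.4 can be illustrated with a small numerical sketch: build a map as a composition of compactly supported, near-identity perturbations $g_1, \ldots, g_s$ and check that the composite stays close to the identity and acts as the identity outside the union of supports. The bump function, the centers, and the value $\delta = 0.05$ below are illustrative assumptions, not objects from the paper.

```python
import numpy as np

def near_identity(delta, center=0.0, width=1.0):
    """Return a toy near-identity map x -> x + delta * bump(x), where bump is a
    smooth function compactly supported on |x - center| < width. Small delta
    keeps the map close to the identity (illustrative stand-in for a
    (delta, k)-near-identity diffeomorphism)."""
    def g(x):
        x = np.asarray(x, dtype=float)
        r2 = ((x - center) / width) ** 2
        inside = r2 < 1.0
        bump = np.zeros_like(x)
        # classic smooth bump: exp(-1 / (1 - r^2)) inside the support, 0 outside
        bump[inside] = np.exp(-1.0 / (1.0 - r2[inside]))
        return x + delta * bump
    return g

def compose(maps):
    """Compose maps left to right: compose([g1, g2])(x) = g2(g1(x)),
    mirroring f = g_s o ... o g_1."""
    def f(x):
        for g in maps:
            x = g(x)
        return x
    return f

# f = g_3 o g_2 o g_1 with three small bumps (centers chosen arbitrarily)
gs = [near_identity(0.05, center=c) for c in (-1.0, 0.0, 1.0)]
f = compose(gs)
```

Each factor moves a point by at most $\delta e^{-1}$, so the composite is still a small perturbation of the identity, and it is exactly the identity outside $[-2, 2]$, the union of the supports.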
Lemma 5. Let $S = (Z_1, \ldots, Z_n)$ be a collection of $n$ independent random variables, and let $\Phi$ be an arbitrary random variable defined on the same probability space. Furthermore, each of these summands has zero mean.

Given a deterministic algorithm $f$, we consider the algorithm that adds Gaussian noise to the predictions of $f$:
$$f_\sigma(z, x, R) = f(z, x) + \xi, \qquad (151)$$
where $\xi \sim \mathcal{N}(0, \sigma^2 I_d)$. The figure in the middle repeats the experiment of Figure 1a while making the training algorithm stochastic by randomizing the seed.

Table 1: The architecture of the 4-layer convolutional neural network used in the MNIST 4 vs. 9 classification tasks.
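The randomization in Eq. (151) can be sketched in a few lines: wrap a deterministic predictor $f(z, x)$ so that each call adds isotropic Gaussian noise $\xi \sim \mathcal{N}(0, \sigma^2 I_d)$ to the prediction. The toy predictor, the value $\sigma = 0.1$, and the helper name `make_noisy_predictor` below are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def make_noisy_predictor(f, sigma, seed=None):
    """Wrap a deterministic predictor f(z, x) into the randomized predictor
    f_sigma(z, x) = f(z, x) + xi, with xi ~ N(0, sigma^2 I_d). The random
    draw plays the role of the auxiliary randomness R in Eq. (151)."""
    rng = np.random.default_rng(seed)
    def f_sigma(z, x):
        pred = np.asarray(f(z, x), dtype=float)
        # isotropic Gaussian noise matching the prediction's shape
        xi = rng.normal(loc=0.0, scale=sigma, size=pred.shape)
        return pred + xi
    return f_sigma

# toy deterministic predictor (an assumption for illustration only)
f = lambda z, x: np.dot(z, x)
f_sigma = make_noisy_predictor(f, sigma=0.1, seed=0)
```

Averaged over many calls, the noisy predictions are centered at $f(z, x)$ with per-coordinate standard deviation $\sigma$, which is exactly the smoothing the lemma's setup relies on.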