8 Supplementary Material
For the GLOW experiment we stacked three GLOW transformations at different scales, each with eight affine coupling blocks separated by actnorms and permutations, with each block parameterized by a CNN with two hidden layers of 512 filters each.

In a recent arXiv submission, Arjovsky et al. [2] suggested that in the presence of an observable variability in the environment e (e.g.

While this procedure worked on distributions that were very similar to begin with, in the majority of cases the log-likelihood fit to B did not provide informative gradients when evaluated on the transformed dataset, as the KL-divergence between distributions with disjoint supports is infinite. The code is available in lrmf_gradient_simulation.ipynb.

The LRMF objective (Eq 2) decreases over time and reaches zero when the two datasets are aligned.
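
The remark above that the KL-divergence between distributions with disjoint supports is infinite can be illustrated with a small discrete example. The sketch below is not taken from lrmf_gradient_simulation.ipynb; the function name `kl_divergence` and the four-bin histograms are hypothetical and chosen only for illustration.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions given as probability lists.

    Illustrative helper (not from the paper's code). Bins with p_i = 0
    contribute zero; a bin with p_i > 0 and q_i = 0 makes the result infinite.
    """
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0.0:
            continue  # 0 * log(0 / q) is taken to be 0 by convention
        if q_i == 0.0:
            return math.inf  # p puts mass where q has none
        total += p_i * math.log(p_i / q_i)
    return total


# Hypothetical histograms over four bins.
p_disjoint = [0.5, 0.5, 0.0, 0.0]  # all mass on the first two bins
q_disjoint = [0.0, 0.0, 0.5, 0.5]  # all mass on the last two bins

p_overlap = [0.4, 0.4, 0.1, 0.1]   # same shape, but every bin has some mass
q_overlap = [0.1, 0.1, 0.4, 0.4]

kl_disjoint = kl_divergence(p_disjoint, q_disjoint)  # infinite
kl_overlap = kl_divergence(p_overlap, q_overlap)     # finite
```

With disjoint supports the divergence stays infinite however the mass is arranged within each support, so it gives no signal about how far apart the two distributions are. Once the supports overlap, the value is finite and changes as the distributions move closer together.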