 class-conditional sample



Reviews: Triple Generative Adversarial Nets

Neural Information Processing Systems

In this paper, the authors propose a new formulation of adversarial networks for image generation that uses three networks instead of the usual two, the generator G and the discriminator D. The third network is a classifier C, which cooperates with G to learn a compatible joint distribution (X,Y) over images and labels. The authors show how this formulation overcomes a pitfall of previous class-conditional GANs, namely that class-conditional generator and discriminator networks have competing objectives, which may prevent them from learning the true distribution and prevent G from accurately generating class-conditional samples. The authors identify the following deficiency in class-conditional GAN setups: "The competition between G and D essentially arises from their two-player formulation, where a single discriminator network has to play two incompatible roles--identifying fake samples and predicting labels". The argument goes that if G were perfect, a class-conditional D would have an equal incentive to output 0, since the sample comes from G, and to output 1, since the image matches the label. This might force D to systematically underperform as a classifier, and therefore prevent G from learning to produce accurate class-conditional samples.
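The conflict described above can be made concrete with a toy calculation. The sketch below is not taken from the paper under review; it assumes the standard GAN result that, for a fixed generator, the optimal discriminator is D*(x, y) = p_data(x, y) / (p_data(x, y) + p_g(x, y)), and applies it to a made-up 2x2 joint distribution over images and labels. The function name and the numbers are illustrative assumptions.

```python
import numpy as np

def optimal_discriminator(p_data, p_gen):
    """Optimal D for the standard GAN objective: p_data / (p_data + p_gen)."""
    p_data = np.asarray(p_data, dtype=float)
    p_gen = np.asarray(p_gen, dtype=float)
    return p_data / (p_data + p_gen)

# Hypothetical joint distribution: rows index two images, columns two labels.
p_data = np.array([[0.4, 0.1],
                   [0.1, 0.4]])

# Case 1: a perfect generator reproduces the data joint exactly.
d_perfect = optimal_discriminator(p_data, p_data)

# Case 2: an imperfect generator that ignores labels
# (its joint is the product of the data marginals).
p_indep = np.outer(p_data.sum(axis=1), p_data.sum(axis=0))
d_imperfect = optimal_discriminator(p_data, p_indep)

print(d_perfect)    # constant output: no label information left in D
print(d_imperfect)  # higher on matching (image, label) pairs than on mismatched ones
```

With a perfect generator the single real-versus-fake output is constant, so it cannot also serve as a label predictor, which is one way to read the reviewer's point about a single discriminator playing two roles.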


Reviews: Memory Replay GANs: Learning to Generate New Categories without Forgetting

Neural Information Processing Systems

Update following the author rebuttal: I would like to thank the authors for their thoughtful rebuttal. I feel like they appropriately addressed the main points I raised, namely the incomplete evaluation and the choice of GANs over other generative model families, and I'm inclined to recommend the paper's acceptance. I updated my review score accordingly. The paper is well-written and its exposition of the problem, proposed solution, and related work is clear. Starting from the AC-GAN conditional generative modeling formulation, the authors introduce the notion of a sequence of tasks by modeling image classes (for MNIST, SVHN, and LSUN) in sequence, where the model for each class in the sequence is initialized with the model parameters for the previous class in the sequence.
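The sequential setup described above, where each class starts from the parameters learned on the previous one, can be sketched as a warm-started training loop. This is a toy stand-in and not the paper's method: `train_on_class` is a hypothetical helper that fits a mean vector by gradient descent instead of training an AC-GAN, and the data are made up. It only illustrates the initialization scheme and why naive sequential training tends to overwrite what was learned earlier.

```python
import copy
import numpy as np

def train_on_class(params, data, steps=200, lr=0.1):
    """Toy stand-in for model training: fit a mean vector with squared error."""
    params = copy.deepcopy(params)  # do not mutate the caller's parameters
    target = data.mean(axis=0)
    for _ in range(steps):
        grad = 2.0 * (params["mu"] - target)
        params["mu"] -= lr * grad
    return params

def train_sequentially(class_datasets, init_params):
    """Train on each class in order, warm-starting from the previous class."""
    params = init_params
    history = []
    for data in class_datasets:
        params = train_on_class(params, data)  # initialized from previous task
        history.append(copy.deepcopy(params))
    return history

# Two hypothetical "classes" with different data means.
datasets = [np.full((10, 2), 0.0), np.full((10, 2), 5.0)]
history = train_sequentially(datasets, {"mu": np.ones(2)})

print(history[0]["mu"])   # fitted to the first class
print(history[-1]["mu"])  # fitted to the second class; the first is no longer represented
```

After the second task, the parameters match only the most recent class, which is the forgetting behavior that the paper's title refers to.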


No MCMC for me: Amortized sampling for fast and stable training of energy-based models

Grathwohl, Will, Kelly, Jacob, Hashemi, Milad, Norouzi, Mohammad, Swersky, Kevin, Duvenaud, David

arXiv.org Artificial Intelligence

Energy-Based Models (EBMs) present a flexible and appealing way to represent uncertainty. Despite recent advances, training EBMs on high-dimensional data remains a challenging problem as the state-of-the-art approaches are costly, unstable, and require considerable tuning and domain expertise to apply successfully. In this work, we present a simple method for training EBMs at scale which uses an entropy-regularized generator to amortize the MCMC sampling typically used in EBM training. We improve upon prior MCMC-based entropy regularization methods with a fast variational approximation. We demonstrate the effectiveness of our approach by using it to train tractable likelihood models. Next, we apply our estimator to the recently proposed Joint Energy Model (JEM), where we match the original performance with faster and more stable training. This allows us to extend JEM models to semi-supervised classification on tabular data from a variety of continuous domains.
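One way to see why an entropy term appears in the generator objective is the standard variational identity log Z = max_q E_q[f(x)] + H(q), where f is the negative energy and the maximum is attained at q(x) proportional to exp(f(x)). The sketch below checks this numerically on a small discrete space. It is a generic illustration under that assumption, not the estimator proposed in the abstract; the function names and the values of `f` are made up.

```python
import numpy as np

def log_partition(f):
    """Numerically stable log of the sum of exp(f) over a finite state space."""
    m = f.max()
    return m + np.log(np.exp(f - m).sum())

def variational_bound(f, q):
    """E_q[f] + H(q), a lower bound on log Z for any distribution q."""
    q = np.asarray(q, dtype=float)
    nz = q > 0  # treat 0 * log(0) as 0
    entropy = -(q[nz] * np.log(q[nz])).sum()
    return (q * f).sum() + entropy

# Hypothetical negative energies for four states.
f = np.array([1.0, 0.0, -2.0, 0.5])
log_z = log_partition(f)

q_star = np.exp(f - log_z)   # the model distribution itself
uniform = np.full(4, 0.25)   # a mismatched "generator"

bound_star = variational_bound(f, q_star)
bound_uniform = variational_bound(f, uniform)

print(log_z, bound_star, bound_uniform)
```

The bound is tight only when the sampler matches the model distribution, so a generator trained to maximize expected negative energy plus entropy is pushed toward the EBM's own samples. Without the entropy term, the maximizer would collapse onto the single lowest-energy state.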