On How Well Generative Adversarial Networks Learn Densities: Nonparametric and Parametric Results Machine Learning

In this paper, we study the rate of convergence for learning distributions with the Generative Adversarial Networks (GAN) framework, which subsumes Wasserstein, Sobolev and MMD GANs as special cases. We study a wide range of parametric and nonparametric target distributions, under a collection of objective evaluation metrics. On the nonparametric end, we investigate the minimax optimal rates and the fundamental difficulty of density estimation under the adversarial framework. On the parametric end, we establish theory for neural network classes that characterizes the interplay between the choice of generator and discriminator. We investigate how to improve the GAN framework with better theoretical guarantees through the lens of regularization. We discover and isolate a new notion of regularization, called generator/discriminator pair regularization, that sheds light on the advantage of GANs over classic parametric and nonparametric approaches to density estimation.

Bayesian Conditional Generative Adverserial Networks Machine Learning

Traditional GANs use a deterministic generator function (typically a neural network) to transform a random noise input $z$ into a sample $\mathbf{x}$ that the discriminator seeks to distinguish. We propose a new GAN called Bayesian Conditional Generative Adversarial Networks (BC-GANs) that uses a random generator function to transform a deterministic input $y'$ into a sample $\mathbf{x}$. Our BC-GANs extend traditional GANs to a Bayesian framework, and naturally handle unsupervised, supervised, and semi-supervised learning problems. Experiments show that the proposed BC-GANs outperform the state of the art.

Stackelberg GAN: Towards Provable Minimax Equilibrium via Multi-Generator Architectures Machine Learning

Generative Adversarial Nets (GANs) are emerging objects of study in machine learning, computer vision, natural language processing, and many other domains. In machine learning, study of such a framework has led to significant advances in adversarial defenses [28, 24] and machine security [4, 24]. In computer vision and natural language processing, GANs have resulted in improved performance over standard generative models for images and texts [13], such as the variational autoencoder [16] and the deep Boltzmann machine [22]. A main technique to achieve this goal is to play a minimax two-player game between generator and discriminator, in which the generator tries to confuse the discriminator with its generated content and the discriminator tries to distinguish real images/texts from what the generator creates. Despite the large number of GAN variants, many fundamental questions remain unresolved. One of the longstanding challenges is designing universal, easy-to-implement architectures that alleviate the instability of GAN training. Ideally, GANs are supposed to solve the minimax optimization problem [13], but in practice alternating gradient descent methods do not clearly privilege minimax over maximin or vice versa (page 35, [12]), which may lead to instability in training if there is a large discrepancy between the minimax and maximin objective values. The focus of this work is on improving the stability of this minimax game in the training process of GANs.

Under review as a conference paper at ICLR 2019.
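
The discrepancy between minimax and maximin values mentioned above can be illustrated on a toy finite game. The following is a minimal sketch, not the paper's method: the payoff matrices and the function names `minimax_value`, `maximin_value`, and `duality_gap` are hypothetical, and the "generator" and "discriminator" are reduced to players picking a row and a column of a small matrix.

```python
# Toy illustration (assumption: a finite zero-sum game stands in for GAN training).
# payoff[i][j] is the objective value when the minimizing player (rows, "generator")
# picks i and the maximizing player (columns, "discriminator") picks j.

def minimax_value(payoff):
    """min over rows of the max over columns: the minimizer commits first."""
    return min(max(row) for row in payoff)

def maximin_value(payoff):
    """max over columns of the min over rows: the maximizer commits first."""
    columns = list(zip(*payoff))
    return max(min(col) for col in columns)

def duality_gap(payoff):
    """Gap between the two orderings; always non-negative."""
    return minimax_value(payoff) - maximin_value(payoff)

# Matching-pennies style game: no pure saddle point, so the orderings disagree.
pennies = [[1, -1],
           [-1, 1]]

# A game with a pure saddle point: both orderings give the same value.
saddle = [[2, 1],
          [4, 3]]

print(duality_gap(pennies))  # 2
print(duality_gap(saddle))   # 0
```

When the gap is zero, the order in which the two players optimize does not matter. When it is large, an optimizer that does not clearly favor one ordering can drift between two quite different objective values, which is the kind of instability the abstract refers to.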

Geometric GAN Machine Learning

Generative Adversarial Nets (GANs) represent an important milestone for effective generative models and have inspired numerous variants that seemingly differ from each other. One of the main contributions of this paper is to reveal a unified geometric structure in GAN and its variants. Specifically, we show that adversarial generative model training can be decomposed into three geometric steps: separating hyperplane search, discriminator parameter update away from the separating hyperplane, and generator update along the normal vector direction of the separating hyperplane. This geometric intuition reveals the limitations of the existing approaches and leads us to propose a new formulation, called geometric GAN, that uses an SVM separating hyperplane that maximizes the margin. Our theoretical analysis shows that the geometric GAN converges to a Nash equilibrium between the discriminator and generator. In addition, extensive numerical results demonstrate the superior performance of geometric GAN.
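
A margin-maximizing SVM objective is commonly written with a hinge loss, so one plausible reading of the formulation above is sketched below. The abstract gives no formulas, so the exact losses, the margin of 1, and the function names `hinge_discriminator_loss` and `hinge_generator_loss` are assumptions for illustration only. Scores are the discriminator's real-valued outputs, i.e. signed distances from the separating hyperplane up to scaling.

```python
# Sketch of SVM-style hinge objectives for a GAN (assumed form, not taken
# from the abstract). Inputs are plain lists of discriminator scores.

def hinge_discriminator_loss(real_scores, fake_scores):
    """Penalize real samples scoring below +1 and fake samples scoring above -1."""
    real_term = sum(max(0.0, 1.0 - s) for s in real_scores) / len(real_scores)
    fake_term = sum(max(0.0, 1.0 + s) for s in fake_scores) / len(fake_scores)
    return real_term + fake_term

def hinge_generator_loss(fake_scores):
    """Push generated samples toward the real side of the hyperplane
    by maximizing their mean score (minimizing its negative)."""
    return -sum(fake_scores) / len(fake_scores)

real = [2.0, 0.5]    # second real sample sits inside the margin
fake = [-2.0, 0.0]   # second fake sample sits on the hyperplane

print(hinge_discriminator_loss(real, fake))  # 0.75
print(hinge_generator_loss(fake))            # 1.0
```

Only samples inside the margin contribute to the discriminator loss, which mirrors the role of support vectors in an SVM, while the generator term depends on every generated sample's score.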

Investigating Under and Overfitting in Wasserstein Generative Adversarial Networks Machine Learning

We investigate under- and overfitting in Generative Adversarial Networks (GANs), using discriminators unseen by the generator to measure generalization. We find that the model capacity of the discriminator has a significant effect on the generator's model quality, and that the generator's poor performance coincides with the discriminator underfitting. Contrary to our expectations, we find that generators with large model capacities relative to the discriminator do not show evidence of overfitting on CIFAR10, CIFAR100, and CelebA.