I experimented with generating faces of cats using Generative adversarial networks (GAN). I wanted to try DCGAN, WGAN and WGAN-GP in low and higher resolutions. I used the CAT dataset (yes this is a real thing!) for my training sample. This dataset has 10k pictures of cats. I centered the images on the kitty faces and I removed outliers (I did this from visual inspection, it took a couple of hours…).
Generative Adversarial Network (GAN) is a current focal point of research. The body of knowledge is fragmented, leading to a trial-error method while selecting an appropriate GAN for a given scenario. We provide a comprehensive summary of the evolution of GANs starting from its inception addressing issues like mode collapse, vanishing gradient, unstable training and non-convergence. We also provide a comparison of various GANs from the application point of view, its behaviour and implementation details. We propose a novel framework to identify candidate GANs for a specific use case based on architecture, loss, regularization and divergence. We also discuss application of the framework using an example, and we demonstrate a significant reduction in search space. This efficient way to determine potential GANs lowers unit economics of AI development for organizations.
Generative adversarial networks (GANs) have been shown to produce realistic samples from high-dimensional distributions, but training them is considered hard. A possible explanation for training instabilities is the inherent imbalance between the networks: While the discriminator is trained directly on both real and fake samples, the generator only has control over the fake samples it produces since the real data distribution is fixed by the choice of a given dataset. We propose a simple modification that gives the generator control over the real samples which leads to a tempered learning process for both generator and discriminator. The real data distribution passes through a lens before being revealed to the discriminator, balancing the generator and discriminator by gradually revealing more detailed features necessary to produce high-quality results. The proposed module automatically adjusts the learning process to the current strength of the networks, yet is generic and easy to add to any GAN variant. In a number of experiments, we show that this can improve quality, stability and/or convergence speed across a range of different GAN architectures (DCGAN, LSGAN, WGAN-GP).
We propose Unbalanced GANs, which pre-trains the generator of the generative adversarial network (GAN) using variational autoencoder (VAE). We guarantee the stable training of the generator by preventing the faster convergence of the discriminator at early epochs. Furthermore, we balance between the generator and the discriminator at early epochs and thus maintain the stabilized training of GANs. We apply Unbalanced GANs to well known public datasets and find that Unbalanced GANs reduce mode collapses. We also show that Unbalanced GANs outperform ordinary GANs in terms of stabilized learning, faster convergence and better image quality at early epochs.
Generative Adversarial Networks (GANs) are powerful generative models, but suffer from training instability. The recently proposed Wasserstein GAN (WGAN) makes progress toward stable training of GANs, but sometimes can still generate only low-quality samples or fail to converge. We find that these problems are often due to the use of weight clipping in WGAN to enforce a Lipschitz constraint on the critic, which can lead to undesired behavior. We propose an alternative to clipping weights: penalize the norm of gradient of the critic with respect to its input. Our proposed method performs better than standard WGAN and enables stable training of a wide variety of GAN architectures with almost no hyperparameter tuning, including 101-layer ResNets and language models over discrete data. We also achieve high quality generations on CIFAR-10 and LSUN bedrooms.