infogan
InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets
This paper describes InfoGAN, an information-theoretic extension to the Generative Adversarial Network that is able to learn disentangled representations in a completely unsupervised manner. InfoGAN is a generative adversarial network that also maximizes the mutual information between a small subset of the latent variables and the observation. We derive a lower bound to the mutual information objective that can be optimized efficiently, and show that our training procedure can be interpreted as a variation of the Wake-Sleep algorithm. Specifically, InfoGAN successfully disentangles writing styles from digit shapes on the MNIST dataset, pose from lighting of 3D rendered images, and background digits from the central digit on the SVHN dataset. It also discovers visual concepts that include hair styles, presence/absence of eyeglasses, and emotions on the CelebA face dataset. Experiments show that InfoGAN learns interpretable representations that are competitive with representations learned by existing fully supervised methods.
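The lower bound mentioned in the abstract can be written out as follows. This is a sketch in the notation commonly used for InfoGAN; the symbols are assumptions not defined in the abstract itself: c is the structured latent code, z the remaining noise, G the generator, D the discriminator, Q(c|x) an auxiliary distribution approximating the posterior over codes, V(D, G) the standard GAN value function, and λ a weighting hyperparameter.

```latex
% Variational lower bound on the mutual information between code and sample
I\bigl(c;\, G(z,c)\bigr)
  \;\ge\; \mathbb{E}_{c \sim P(c),\; x \sim G(z,c)}\bigl[\log Q(c \mid x)\bigr] + H(c)
  \;=:\; L_I(G, Q)

% Regularized minimax game
\min_{G,\,Q}\; \max_{D}\; V_{\text{InfoGAN}}(D, G, Q) \;=\; V(D, G) \;-\; \lambda\, L_I(G, Q)
```

When the prior over c is fixed, H(c) is a constant, so maximizing the bound reduces to maximizing the expected log-likelihood of the sampled code under Q.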
IB-GAN: Disentangled Representation Learning with Information Bottleneck Generative Adversarial Networks
Insu Jeon, Wonkwang Lee, Myeongjang Pyeon, Gunhee Kim
We propose a new GAN-based unsupervised model for disentangled representation learning. The model arises from an attempt to apply the Information Bottleneck (IB) framework to the optimization of GANs, and is therefore named IB-GAN. The architecture of IB-GAN is partially similar to that of InfoGAN but has a critical difference: an intermediate layer of the generator is leveraged to constrain the mutual information between the input and the generated output. The intermediate stochastic layer can serve as a learnable latent distribution that is trained jointly with the generator in an end-to-end fashion. As a result, the generator of IB-GAN can harness the latent space in a disentangled and interpretable manner. With experiments on the dSprites and Color-dSprites datasets, we demonstrate that IB-GAN achieves disentanglement scores competitive with those of state-of-the-art β-VAEs and outperforms InfoGAN. Moreover, the visual quality and diversity of samples generated by IB-GAN are often better than those of β-VAEs and InfoGAN in terms of FID score on the CelebA and 3D Chairs datasets.
Reviews: InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets
It is a good paper that should definitely be accepted. The presented approach has a clear theoretical motivation and is supported by a thorough and convincing experimental evaluation. It is important that the approach does not use any domain-specific knowledge and effectively comes at zero additional computational cost. This makes it easily applicable to a wide range of generative tasks. I have several questions/comments: 1) It seems to me that the proposed approach in the end amounts to training a GAN with an additional network (or an additional branch on top of the discriminator) trained to predict part of the latent code from the generated image.
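The reviewer's reading in point 1 (an extra network or discriminator branch trained to predict part of the latent code from the generated image) can be illustrated with a minimal NumPy sketch. It covers only the loss for a categorical code with a uniform prior; the function names, the batch size, and the stand-in logits are hypothetical and are not taken from the paper or the review, and no generator or discriminator is actually trained here.

```python
import numpy as np

def log_softmax(logits):
    # Numerically stable log-softmax over the category axis.
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

def categorical_code_loss(q_logits, code_idx):
    # Mean negative log-likelihood of the sampled codes under Q(c|x).
    # Minimizing this w.r.t. the generator and Q maximizes E[log Q(c|x)].
    logp = log_softmax(q_logits)
    return -logp[np.arange(len(code_idx)), code_idx].mean()

def mi_lower_bound(q_logits, code_idx, num_categories):
    # E[log Q(c|x)] + H(c), with H(c) = log(K) for a uniform prior over K codes.
    return -categorical_code_loss(q_logits, code_idx) + np.log(num_categories)

rng = np.random.default_rng(0)
num_categories, batch = 10, 64

# Codes that would be fed to the generator alongside the noise vector.
code_idx = rng.integers(0, num_categories, size=batch)

# Hypothetical stand-ins for the Q branch's logits on generated images:
# all-zero logits mean Q cannot tell which code produced the image.
uninformative_logits = np.zeros((batch, num_categories))
# Large logits on the true code mean Q recovers the code almost perfectly.
informative_logits = 20.0 * np.eye(num_categories)[code_idx]

bound_uninformative = mi_lower_bound(uninformative_logits, code_idx, num_categories)
bound_informative = mi_lower_bound(informative_logits, code_idx, num_categories)

print(f"bound when Q is uninformative: {bound_uninformative:.4f}")
print(f"bound when Q recovers the code: {bound_informative:.4f}")
```

In this sketch the bound sits at zero when Q ignores the image and approaches log(10), the entropy of the code, when Q recovers it. The quantity being computed is an ordinary classification loss on the generated image, which is consistent with the reviewer's description of the method as a GAN plus a code-prediction branch.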
Reviews: Graphical Generative Adversarial Networks
This paper proposes Graphical-GAN, a variant of GAN that combines the expressivity of Graphical Models (in particular, Bayesian nets) with the generative inductive bias of Generative Adversarial Networks. For highly structured latent variables, such as the ones considered in this work, the discriminator's task of distinguishing X,Z samples from the two distributions can be difficult. As a second major contribution, the work proposes a learning procedure inspired by Expectation Propagation (EP). Here, the factorization structure of the graphical model is explicitly exploited to make the task of the discriminator "easier" by comparing only subsets of variables. Finally, the authors perform experiments for controlled generation using a GAN model with a mixture of Gaussians prior, and a State-Space structure to empirically validate their approach.
InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets
This paper describes InfoGAN, an information-theoretic extension to the Generative Adversarial Network that is able to learn disentangled representations in a completely unsupervised manner. InfoGAN is a generative adversarial network that also maximizes the mutual information between a small subset of the latent variables and the observation. We derive a lower bound of the mutual information objective that can be optimized efficiently. Specifically, InfoGAN successfully disentangles writing styles from digit shapes on the MNIST dataset, pose from lighting of 3D rendered images, and background digits from the central digit on the SVHN dataset. It also discovers visual concepts that include hair styles, presence/absence of eyeglasses, and emotions on the CelebA face dataset. Experiments show that InfoGAN learns interpretable representations that are competitive with representations learned by existing supervised methods. For an up-to-date version of this paper, please see https://arxiv.org/abs/1606.03657.