 w-cgan


Stable Parallel Training of Wasserstein Conditional Generative Adversarial Neural Networks

Pasini, Massimiliano Lupo, Yin, Junqi

arXiv.org Artificial Intelligence

We propose a stable, parallel approach to train Wasserstein Conditional Generative Adversarial Neural Networks (W-CGANs) under the constraint of a fixed computational budget. Unlike previous distributed GAN training techniques, our approach avoids inter-process communication, reduces the risk of mode collapse, and enhances scalability by using multiple generators, each trained concurrently on a single data label. The use of the Wasserstein metric also reduces the risk of cycling by stabilizing the training of each generator. We illustrate the approach on CIFAR10, CIFAR100, and ImageNet1k, three standard benchmark image datasets, maintaining the original image resolution of each dataset. Performance is assessed in terms of scalability and final accuracy within a fixed budget of computational time and resources. To measure accuracy, we use the inception score, the Fréchet inception distance, and image quality. Inception score and Fréchet inception distance improve over previous results obtained by applying the parallel approach to deep convolutional conditional generative adversarial neural networks (DC-CGANs), as does the quality of the new images the GANs generate. Weak scaling is attained on both datasets using up to 2,000 NVIDIA V100 GPUs on the OLCF supercomputer Summit.
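The one-generator-per-label idea can be pictured with a small sketch. This is not the authors' code: the function names, the round-robin assignment of labels to workers, and the plain-Python critic loss are assumptions made for illustration. The point is only that once the samples are split by label, each worker can train on its own subset without exchanging data with the others.

```python
from collections import defaultdict


def partition_by_label(labels):
    """Group sample indices by class label (one group per generator)."""
    groups = defaultdict(list)
    for idx, label in enumerate(labels):
        groups[label].append(idx)
    return dict(groups)


def assign_labels_to_workers(label_set, num_workers):
    """Round-robin mapping of labels to worker ranks.

    Assumption for illustration: with as many workers as labels, each
    worker owns exactly one label; with fewer workers, a worker owns several.
    """
    assignment = defaultdict(list)
    for i, label in enumerate(sorted(label_set)):
        assignment[i % num_workers].append(label)
    return dict(assignment)


def wasserstein_critic_loss(real_scores, fake_scores):
    """Standard Wasserstein critic loss: mean(fake scores) - mean(real scores)."""
    mean_fake = sum(fake_scores) / len(fake_scores)
    mean_real = sum(real_scores) / len(real_scores)
    return mean_fake - mean_real


if __name__ == "__main__":
    labels = [0, 1, 0, 2, 1]
    groups = partition_by_label(labels)
    workers = assign_labels_to_workers(groups.keys(), num_workers=3)
    # Each worker sees only the indices of its own label(s).
    for rank, owned in workers.items():
        print(rank, {lab: groups[lab] for lab in owned})
```

Because every worker touches a disjoint subset of the data and its own generator and critic, no gradients or parameters need to be exchanged during training in a layout like this one.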


Adversarial training for predictive tasks: theoretical analysis and limitations in the deterministic case

Lesieur, Thibault, Messud, Jérémie, Hammoud, Issa, Peng, Hanyuan, Lacombe, Céline, Jeunesse, Paulien

arXiv.org Machine Learning

To train a deep neural network to mimic the outcomes of processing sequences, a version of the Conditional Generative Adversarial Network (CGAN) can be used. Others have observed that a CGAN can improve results even for deterministic sequences, where only one output is associated with the processing of a given input. Surprisingly, our CGAN-based tests on deterministic geophysical processing sequences did not produce a real improvement over an $L_p$ loss; here we propose a first theoretical explanation of why. Our analysis proceeds from the non-deterministic case to the deterministic one. It led us to develop an adversarial way to train a content loss, which gave better results on our data.
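For reference, the $L_p$ baseline mentioned above can be sketched as follows. The abstract does not give the exact form, so the averaging over elements and the function name `lp_loss` are assumptions; the sketch shows only the generic element-wise definition.

```python
def lp_loss(pred, target, p=2):
    """Mean of |pred_i - target_i|**p over all elements.

    Assumption for illustration: the loss is averaged, not summed, and no
    p-th root is taken. In the deterministic case there is a single target
    per input, so this loss is zero exactly when pred equals target.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return sum(abs(a - b) ** p for a, b in zip(pred, target)) / len(pred)


if __name__ == "__main__":
    print(lp_loss([1.0, 2.0], [1.0, 4.0], p=2))
    print(lp_loss([1.0, 2.0], [0.0, 4.0], p=1))
```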