stylegan2
The GAN is dead; long live the GAN! A Modern GAN Baseline
There is a widespread claim that GANs are difficult to train, and GAN architectures in the literature are littered with empirical tricks. We provide evidence against this claim and build a modern GAN baseline in a more principled manner. First, we derive a well-behaved regularized relativistic GAN loss that addresses issues of mode dropping and non-convergence that were previously tackled via a bag of ad-hoc tricks. We analyze our loss mathematically and prove that it admits local convergence guarantees, unlike most existing relativistic losses. Second, this loss allows us to discard all ad-hoc tricks and replace outdated backbones used in common GANs with modern architectures. Using StyleGAN2 as an example, we present a roadmap of simplification and modernization that results in a new minimalist baseline, R3GAN. Despite being simple, our approach surpasses StyleGAN2 on FFHQ, ImageNet, CIFAR, and Stacked MNIST datasets, and compares favorably against state-of-the-art GANs and diffusion models.
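To make the idea of a relativistic loss concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not spell out the loss, so the code uses one common formulation in which the critic is scored on the difference between its outputs for a paired real and generated sample, passed through a softplus. The function names are hypothetical, and the regularization terms mentioned in the abstract are omitted because they require automatic differentiation.

```python
import math


def softplus(t):
    """Numerically stable log(1 + exp(t))."""
    return max(t, 0.0) + math.log1p(math.exp(-abs(t)))


def relativistic_d_loss(real_scores, fake_scores):
    """Critic loss (assumed form): mean softplus(D(fake) - D(real)) over pairs.

    The loss is small when each real sample is scored above its paired fake.
    """
    pairs = list(zip(real_scores, fake_scores))
    return sum(softplus(f - r) for r, f in pairs) / len(pairs)


def relativistic_g_loss(real_scores, fake_scores):
    """Generator loss (assumed form): mean softplus(D(real) - D(fake)) over pairs.

    The loss is small when each fake sample is scored above its paired real.
    """
    pairs = list(zip(real_scores, fake_scores))
    return sum(softplus(r - f) for r, f in pairs) / len(pairs)


if __name__ == "__main__":
    real = [5.0, 4.0]    # hypothetical critic outputs on real images
    fake = [-5.0, -4.0]  # hypothetical critic outputs on generated images
    print("critic loss:", relativistic_d_loss(real, fake))
    print("generator loss:", relativistic_g_loss(real, fake))
```

Because only the difference of critic outputs enters the loss, shifting all scores by a constant leaves both values unchanged; when the critic cannot separate real from generated samples, both losses equal log 2.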
Compose Visual Relations
[Figure: (a) Top-1 image-text retrieval results on iGibson scenes. Panels: query image, CLIP, fine-tuned CLIP, ours. Each result is a composed relational description, e.g. "A large brown metal cube below a large green rubber cylinder", "A large gray metal sphere above a small red metal cube", "A blue object in front of a gray object", "A gray object on the left of a green object".]