Image Synthesis

Supplementary Material for Semantic Image Synthesis with Unconditional Generator
JungWoo Chae

Neural Information Processing Systems

This process enables the value (feature maps) to be rearranged, through a weighted sum, to align with the spatial layout of the query, thereby reflecting their strong correspondence. The input noise is removed because its stochasticity slows down training. Given the need to balance high correspondence against image quality, we set the weights of our loss terms empirically. To demonstrate the influence of the additional losses introduced in our method, we provide quantitative and qualitative ablations in Figures S2 and S3, respectively. Nonetheless, caution is warranted when increasing the number of clusters too aggressively.
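For illustration, a minimal sketch (not the authors' code) of the weighted-sum rearrangement described above: each query position gathers value features according to its normalized query-key similarity, so the value feature maps are reassembled in the spatial layout of the query. Tensor shapes and names are assumptions.

```python
import torch

def cross_attention_rearrange(query, key, value):
    """query: (Nq, C); key, value: (Nk, C). Returns (Nq, C).

    Each output row is a weighted sum of the value rows, weighted by the
    normalized query-key similarity, i.e. the value features are rearranged
    to follow the layout of the query."""
    scale = key.shape[-1] ** -0.5
    attn = torch.softmax(query @ key.T * scale, dim=-1)  # (Nq, Nk) correspondence
    return attn @ value                                  # weighted sum of values

# Toy usage: 16 query positions, 64 key/value positions, 32 channels.
q, k, v = torch.randn(16, 32), torch.randn(64, 32), torch.randn(64, 32)
print(cross_attention_rearrange(q, k, v).shape)  # torch.Size([16, 32])
```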


SemFlow: Binding Semantic Segmentation and Image Synthesis via Rectified Flow

Neural Information Processing Systems

Semantic segmentation and semantic image synthesis are two representative tasks in visual perception and generation. While existing methods consider them as two distinct tasks, we propose a unified framework (SemFlow) and model them as a pair of reverse problems. Specifically, motivated by rectified flow theory, we train an ordinary differential equation (ODE) model to transport between the distributions of real images and semantic masks. As the training objective is symmetric, samples belonging to the two distributions, images and semantic masks, can be effortlessly transferred reversibly. For semantic segmentation, our approach resolves the contradiction between the randomness of diffusion outputs and the uniqueness of segmentation results. For image synthesis, we propose a finite perturbation approach to enhance the diversity of generated results without changing the semantic categories. Experiments show that our SemFlow achieves competitive results on semantic segmentation and semantic image synthesis tasks. We hope this simple framework will motivate people to rethink the unification of low-level and high-level vision.
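For illustration, a hedged sketch (not the released SemFlow code) of a rectified-flow style training step that transports between an image and a mask representation: sample a time t, form the straight-line interpolation, and regress the constant velocity. The velocity network v_theta, the 3-channel mask encoding, and all shapes are assumptions; because the straight path is symmetric, the same model can in principle be integrated in either direction (mask to image for synthesis, image to mask for segmentation).

```python
import torch
import torch.nn as nn

# Stand-in velocity network; in practice this would be a UNet (assumption).
v_theta = nn.Sequential(
    nn.Conv2d(3 + 1, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 3, 3, padding=1),
)
opt = torch.optim.Adam(v_theta.parameters(), lr=1e-4)

def rectified_flow_step(image, mask):
    """image, mask: (B, 3, H, W) tensors; runs one optimization step."""
    t = torch.rand(image.size(0), 1, 1, 1)           # uniform time in [0, 1)
    x_t = t * image + (1 - t) * mask                 # straight-line interpolation
    target_v = image - mask                          # constant velocity along the path
    t_map = t.expand(-1, 1, *image.shape[2:])        # broadcast t as an extra channel
    pred_v = v_theta(torch.cat([x_t, t_map], dim=1))
    loss = nn.functional.mse_loss(pred_v, target_v)
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

# Toy usage with random tensors standing in for an image/mask pair.
print(rectified_flow_step(torch.rand(2, 3, 32, 32), torch.rand(2, 3, 32, 32)))
```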


Semantic Image Synthesis with Unconditional Generator

Neural Information Processing Systems

Semantic image synthesis (SIS) aims to generate realistic images according to semantic masks given by a user. Although recent methods produce high-quality results with fine spatial control, SIS requires expensive pixel-level annotation of the training images. On the other hand, manipulating intermediate feature maps in a pretrained unconditional generator such as StyleGAN supports coarse spatial control without heavy annotation. In this paper, we introduce a new approach for reflecting a user's detailed guiding mask on a pretrained unconditional generator. Our method converts the user's guiding mask into a proxy mask through a semantic mapper.
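For illustration, a minimal sketch of this idea (the paper's semantic mapper is a learned module; the function names and shapes below are hypothetical): the user's guiding mask is converted to a proxy mask at a feature-map resolution, which then composites the generator's intermediate features region by region.

```python
import torch
import torch.nn.functional as F

def to_proxy_mask(guiding_mask, feat_hw):
    """Downsample a (B, K, H, W) one-hot guiding mask to the feature-map grid."""
    return F.interpolate(guiding_mask, size=feat_hw, mode="nearest")

def blend_features(feats_per_class, proxy_mask):
    """feats_per_class: (K, B, C, h, w) features, one set per semantic class.
    proxy_mask: (B, K, h, w). Returns region-wise composited feature maps."""
    out = torch.zeros_like(feats_per_class[0])
    for k in range(proxy_mask.shape[1]):
        out = out + proxy_mask[:, k:k + 1] * feats_per_class[k]
    return out

# Toy usage: 4 classes, 16 channels, guiding mask at 8x8, features at 4x4.
guiding_mask = torch.zeros(1, 4, 8, 8)
guiding_mask[:, 0] = 1.0                              # everything labelled class 0
proxy = to_proxy_mask(guiding_mask, (4, 4))
out = blend_features(torch.randn(4, 1, 16, 4, 4), proxy)
print(out.shape)  # torch.Size([1, 16, 4, 4])
```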


Learning Semantic-aware Normalization for Generative Adversarial Networks

Neural Information Processing Systems

Recent advances in image generation have been achieved by style-based image generators. Such approaches learn to disentangle latent factors at different image scales and encode latent factors as "style" to control image synthesis. However, existing approaches cannot further disentangle fine-grained semantics from each other, which are often conveyed by different feature channels. In this paper, we propose a novel image synthesis approach by learning Semantic-aware relative importance for feature channels in Generative Adversarial Networks (SariGAN).
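For illustration, a minimal, hedged sketch of channel-wise semantic-aware modulation in the spirit described above (not SariGAN's actual layer): a small network maps a latent/style code to per-channel importance weights that rescale the normalized feature channels. Module names, dimensions, and the sigmoid gating are assumptions.

```python
import torch
import torch.nn as nn

class ChannelImportance(nn.Module):
    """Reweight normalized feature channels with importances predicted from a latent code."""
    def __init__(self, latent_dim, channels):
        super().__init__()
        self.norm = nn.InstanceNorm2d(channels, affine=False)
        self.to_gate = nn.Sequential(nn.Linear(latent_dim, channels), nn.Sigmoid())

    def forward(self, feat, latent):
        # feat: (B, C, H, W), latent: (B, latent_dim)
        gate = self.to_gate(latent).unsqueeze(-1).unsqueeze(-1)  # (B, C, 1, 1)
        return self.norm(feat) * gate                            # per-channel reweighting

layer = ChannelImportance(latent_dim=64, channels=128)
y = layer(torch.randn(2, 128, 16, 16), torch.randn(2, 64))
print(y.shape)  # torch.Size([2, 128, 16, 16])
```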