 semantic image synthesis




Supplementary Material for Semantic Image Synthesis with Unconditional Generator

JungWoo Chae

Neural Information Processing Systems

This process enables the value (feature maps) to be rearranged (through a weighted sum) to align with the form of the query, thereby reflecting their strong correspondence. The input noise is removed because its stochasticity slows down training. Because high correspondence must be balanced against image quality, we set the weights of our loss terms empirically. To demonstrate the influence of the additional losses introduced in our method, we provide quantitative and qualitative ablations in Figures S2 and S3, respectively. Nonetheless, caution is warranted when increasing the number of clusters too far.
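The query/value rearrangement described in the first sentence of this excerpt can be illustrated with a small attention sketch. The excerpt does not give the exact formula, so the scaled dot-product form, the function names, and the toy inputs below are all assumptions made for illustration, not the authors' implementation.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, keys, values):
    """Rearrange `values` to follow `queries` via a weighted sum.

    Assumed form: softmax(Q K^T / sqrt(d)) V. Each input is a list of
    per-position vectors (hypothetical layout, chosen for readability).
    """
    d = len(keys[0])
    out = []
    for q in queries:
        # Similarity of this query position to every key position.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # Weighted sum of value vectors, one output vector per query position.
        out.append([sum(w * v[c] for w, v in zip(weights, values))
                    for c in range(len(values[0]))])
    return out

# Toy example: the queries match the keys in swapped order,
# so the values come out (almost) swapped.
queries = [[10.0, 0.0], [0.0, 10.0]]
keys = [[0.0, 1.0], [1.0, 0.0]]
values = [[1.0, 0.0], [0.0, 1.0]]
rearranged = cross_attention(queries, keys, values)
```

In this toy case each output position is dominated by the value whose key best matches the query, which is the sense in which the values are rearranged to the form of the query.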



Learning to Predict Layout-to-image Conditional Convolutions for Semantic Image Synthesis

Neural Information Processing Systems

Semantic image synthesis aims at generating photorealistic images from semantic layouts. Previous approaches based on conditional generative adversarial networks (GANs) show state-of-the-art performance on this task; they either feed the semantic label maps to the generator as inputs or use them to modulate the activations in normalization layers via affine transformations. We argue that convolutional kernels in the generator should be aware of the distinct semantic labels at different locations when generating images. To better exploit the semantic layout in the image generator, we propose to predict convolutional kernels conditioned on the semantic label map, use them to generate the intermediate feature maps from the noise maps, and eventually generate the images. Moreover, we propose a feature pyramid semantics-embedding discriminator, which is more effective than previous multi-scale discriminators at enhancing fine details and the semantic alignment between the generated images and the input semantic layouts. We achieve state-of-the-art results on both quantitative metrics and subjective evaluation on various semantic segmentation datasets, demonstrating the effectiveness of our approach.
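The idea of label-aware kernels can be sketched as a convolution whose 3x3 kernel is selected per pixel from the semantic label at that location. This is a deliberately simplified illustration: a fixed lookup table (`kernel_for_label`, a hypothetical name) stands in for the learned kernel predictor the abstract describes, and the function name, zero padding, and toy data are assumptions.

```python
def spatially_varying_conv(feature, labels, kernel_for_label):
    """Apply a 3x3 kernel chosen per pixel by its semantic label.

    `feature` and `labels` are 2D lists of the same size; out-of-bounds
    neighbours are treated as zero (zero padding).
    """
    h, w = len(feature), len(feature[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # The kernel depends on the label at this location.
            kernel = kernel_for_label[labels[y][x]]
            acc = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += kernel[dy + 1][dx + 1] * feature[yy][xx]
            out[y][x] = acc
    return out

# Toy example: label 0 keeps the pixel unchanged, label 1 sums its neighbourhood.
identity = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
box_sum = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
noise_map = [[1.0, 2.0], [3.0, 4.0]]
label_map = [[0, 1], [0, 1]]
result = spatially_varying_conv(noise_map, label_map, {0: identity, 1: box_sum})
```

With a learned predictor in place of the lookup table, the same loop structure would let each semantic region be processed by its own filters, which is the behaviour the abstract argues for.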


Semantic Image Synthesis with Unconditional Generator

Neural Information Processing Systems

Semantic image synthesis (SIS) aims to generate realistic images according to semantic masks given by a user. Although recent methods produce high-quality results with fine spatial control, SIS requires expensive pixel-level annotation of the training images. On the other hand, manipulating intermediate feature maps in a pretrained unconditional generator such as StyleGAN supports coarse spatial control without heavy annotation. In this paper, we introduce a new approach for reflecting a user's detailed guiding masks on a pretrained unconditional generator. Our method converts a user's guiding mask to a proxy mask through a semantic mapper.


SemFlow: Binding Semantic Segmentation and Image Synthesis via Rectified Flow

Neural Information Processing Systems

For image synthesis, we propose a finite perturbation approach to enhance the diversity of generated results without changing the semantic categories. Experiments show that our SemFlow achieves competitive results on semantic segmentation and semantic image synthesis tasks.

