Reviews: Learning to Predict Layout-to-image Conditional Convolutions for Semantic Image Synthesis
This paper proposes a strongly conditioned network for generating images from semantic maps. How sensitive is this network to small changes in the input map? For example, given three sequential frames of a video (as segmentation maps), is the model consistent in assigning colors and structures, or do small changes in the geometry of the semantic objects have a large impact on the output? This is mostly curiosity, as a model with inherent smoothness has great potential for video applications. Some qualitative results comparing against other models were shown, but visualizing the important regions of the input conditioning and the influence of input perturbations on the model output could also yield valuable insight; methods such as GradCAM or related techniques may be applicable for checking the importance of input features.
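The perturbation test suggested above could be sketched as follows. Note that `generator` here is only a hypothetical stand-in for the paper's layout-to-image model, and the one-pixel shift is just one simple choice of geometric perturbation:

```python
import numpy as np

def generator(label_map):
    # Stand-in for a layout-to-image model (NOT the paper's network):
    # returns a deterministic "image" derived from the label map.
    return np.stack([label_map, label_map ** 2, label_map + 1], axis=-1).astype(float)

rng = np.random.default_rng(0)
seg = rng.integers(0, 5, size=(8, 8))

# Perturb the geometry slightly: shift the segmentation map by one pixel
seg_shifted = np.roll(seg, shift=1, axis=1)

out_a = generator(seg)
out_b = generator(seg_shifted)

# Sensitivity: mean output change relative to the fraction of changed input pixels
input_change = np.mean(seg != seg_shifted)
output_change = np.mean(np.abs(out_a - out_b))
print(input_change, output_change)
```

A model with the desired smoothness would show an `output_change` that stays small and roughly proportional to `input_change`.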
Learning to Predict Layout-to-image Conditional Convolutions for Semantic Image Synthesis
Semantic image synthesis aims at generating photorealistic images from semantic layouts. Previous approaches with conditional generative adversarial networks (GAN) show state-of-the-art performance on this task, which either feed the semantic label maps as inputs to the generator, or use them to modulate the activations in normalization layers via affine transformations. We argue that convolutional kernels in the generator should be aware of the distinct semantic labels at different locations when generating images. In order to better exploit the semantic layout for the image generator, we propose to predict convolutional kernels conditioned on the semantic label map to generate the intermediate feature maps from the noise maps and eventually generate the images. Moreover, we propose a feature pyramid semantics-embedding discriminator, which is more effective in enhancing fine details and semantic alignments between the generated images and the input semantic layouts than previous multi-scale discriminators.
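The core idea of the abstract, predicting convolutional kernels from the semantic layout rather than feeding the layout in as an ordinary input, can be illustrated with a minimal sketch. All shapes and the linear `predictor` below are illustrative assumptions (a 1x1 per-location convolution, not the paper's actual architecture):

```python
import numpy as np

rng = np.random.default_rng(0)

H, W = 4, 4           # spatial size
num_labels = 3        # semantic classes
c_in, c_out = 2, 2    # feature channels

# One-hot semantic layout, shape (H, W, num_labels)
labels = rng.integers(0, num_labels, size=(H, W))
layout = np.eye(num_labels)[labels]

# Hypothetical kernel predictor: linear map from label vector to a flat 1x1 kernel
predictor = rng.normal(size=(num_labels, c_in * c_out))

# Predict a distinct kernel at every location, shape (H, W, c_out, c_in)
kernels = (layout @ predictor).reshape(H, W, c_out, c_in)

# Noise feature map, shape (H, W, c_in)
noise = rng.normal(size=(H, W, c_in))

# Apply the per-location kernels: a layout-conditioned 1x1 convolution
out = np.einsum('hwoi,hwi->hwo', kernels, noise)
print(out.shape)  # (4, 4, 2)
```

The point of the construction is that locations with different semantic labels are transformed by different kernels, so the convolution itself is "aware" of the layout.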
Xihui Liu, Guojun Yin, Jing Shao, Xiaogang Wang, Hongsheng Li