SEGA: Instructing Text-to-Image Models using Semantic Guidance
Neural Information Processing Systems
Text-to-image diffusion models have recently received a lot of interest for their astonishing ability to produce high-fidelity images from text alone. However, achieving one-shot generation that aligns with the user's intent is nearly impossible, and small changes to the input prompt often result in very different images. This leaves the user with little semantic control. To put the user in control, we show how to interact with the diffusion process to flexibly steer it along semantic directions.
Dec-25-2025, 05:01:02 GMT