Ctrl-X: Controlling Structure and Appearance for Text-To-Image Generation Without Guidance

Neural Information Processing Systems

Recent controllable generation approaches such as FreeControl and Diffusion Self-Guidance bring fine-grained spatial and appearance control to text-to-image (T2I) diffusion models without training auxiliary modules.