TextDiffuser: Diffusion Models as Text Painters
–Neural Information Processing Systems
TextDiffuser consists of two stages: first, a Transformer model generates the layout of keywords extracted from text prompts, and then diffusion models generate images conditioned on the text prompt and the generated layout.
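The two-stage flow described above can be sketched as follows. This is only an illustrative outline, not the TextDiffuser implementation: every name here (`Box`, `extract_keywords`, `plan_layout`, `generate`) is hypothetical, treating double-quoted spans as the keywords is an assumption, the evenly spaced row layout is a stand-in for the Transformer layout model, and the second stage returns only the conditioning inputs because no diffusion model is included.

```python
import re
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Box:
    """Normalized bounding box (0..1) for one keyword. Hypothetical type."""
    keyword: str
    x0: float
    y0: float
    x1: float
    y1: float


def extract_keywords(prompt: str) -> List[str]:
    # Assumption: the text to be rendered appears in double quotes in the prompt.
    return re.findall(r'"([^"]+)"', prompt)


def plan_layout(keywords: List[str]) -> List[Box]:
    # Stage 1 stand-in: a trained Transformer would predict these boxes.
    # Here each keyword simply gets one evenly spaced horizontal band.
    n = len(keywords)
    return [
        Box(kw, 0.1, i / n, 0.9, (i + 1) / n)
        for i, kw in enumerate(keywords)
    ]


def generate(prompt: str) -> Dict[str, object]:
    # Stage 2 stand-in: a diffusion model would be conditioned on both the
    # prompt and the layout. This sketch only returns those conditioning inputs.
    layout = plan_layout(extract_keywords(prompt))
    return {"prompt": prompt, "layout": layout}


result = generate('a poster that says "Hello" and "World"')
for box in result["layout"]:
    print(box)
```

The point of the split is that the layout is an explicit intermediate result, so it can be inspected or edited before image generation.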
Feb-19-2026, 02:07:35 GMT