LayerCraft: Enhancing Text-to-Image Generation with CoT Reasoning and Layered Object Integration

Neural Information Processing Systems 

Text-to-image (T2I) generation has made remarkable progress, yet existing systems still lack intuitive control over spatial composition, object consistency, and multi-step editing.