Enabling Instructional Image Editing with In-Context Generation in Large Scale Diffusion Transformer

Neural Information Processing Systems 

Instruction-based image editing enables precise modifications via natural language prompts, but existing methods face a precision-efficiency tradeoff: fine-tuning demands massive datasets (>10M) and substantial computational resources, while training-free approaches suffer from weak instruction comprehension.