Detailed Human-Centric Text Description-Driven Large Scene Synthesis