Language-driven Scene Synthesis using Multi-conditional Diffusion Model

Jan-18-2025, 20:20:50 GMT–Neural Information Processing Systems

Scene synthesis is a challenging problem with several industrial applications. Recently, substantial effort has been directed toward synthesizing scenes from human motions, room layouts, or spatial graphs as input. However, few studies have addressed the problem from multiple modalities, especially in combination with text prompts. In this paper, we propose language-driven scene synthesis, a new task that integrates text prompts, human motion, and existing objects for scene synthesis. Unlike single-condition synthesis tasks, our problem involves multiple conditions and requires a strategy for processing and encoding them into a unified space.

language-driven scene synthesis, multi-conditional diffusion model, synthesis

Technology:
- Information Technology > Artificial Intelligence > Machine Learning (0.59)