DynamicVerse: A Physically-Aware Multimodal Framework for 4D World Modeling
–Neural Information Processing Systems
Understanding the dynamic physical world, characterized by its evolving 3D structure, real-world motion, and semantic content with textual descriptions, is crucial for human-agent interaction and enables embodied agents to perceive and act within real environments with human-like capabilities. However, existing datasets are often derived from limited simulators or rely on traditional Structure-from-Motion for up-to-scale annotation, and they offer limited descriptive captioning. These shortcomings restrict the capacity of foundation models to accurately interpret real-world dynamics from monocular videos, which are commonly sourced from the internet.
Jun-13-2026, 12:17:40 GMT