TraceGen: World Modeling in 3D Trace Space Enables Learning from Cross-Embodiment Videos
Lee, Seungjae, Jung, Yoonkyo, Chun, Inkook, Lee, Yao-Chih, Cai, Zikui, Huang, Hongjia, Talreja, Aayush, Dao, Tan Dat, Liang, Yongyuan, Huang, Jia-Bin, Huang, Furong
arXiv.org Artificial Intelligence
Learning new robot tasks on new platforms and in new scenes from only a handful of demonstrations remains challenging. While videos of other embodiments (humans and different robots) are abundant, differences in embodiment, camera, and environment hinder their direct use. We address the small-data problem by introducing a unifying, symbolic representation, a compact 3D "trace-space" of scene-level trajectories, that enables learning from cross-embodiment, cross-environment, and cross-task videos. We present TraceGen, a world model that predicts future motion in trace-space rather than pixel space, abstracting away appearance while retaining the geometric structure needed for manipulation. To train TraceGen at scale, we develop TraceForge, a data pipeline that transforms heterogeneous human and robot videos into consistent 3D traces, yielding a corpus of 123K videos and 1.8M observation-trace-language triplets. Pretraining on this corpus produces a transferable 3D motion prior that adapts efficiently: with just five target robot videos, TraceGen attains 80% success across four tasks while offering 50-600x faster inference than state-of-the-art video-based world models. In the more challenging case where only five uncalibrated human demonstration videos captured on a handheld phone are available, it still reaches 67.5% success on a real robot, highlighting TraceGen's ability to adapt across embodiments without relying on object detectors or heavy pixel-space generation.
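To make the idea of an observation-trace-language triplet concrete, the following is a minimal illustrative sketch only. The abstract does not specify the actual data layout, so the class name `TraceTriplet`, its fields, and the `displacement` helper are all hypothetical. It assumes a trace can be stored as a sequence of time steps, each holding the 3D positions of a fixed set of tracked scene points.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point3D = Tuple[float, float, float]


@dataclass
class TraceTriplet:
    """Hypothetical container for one observation-trace-language triplet.

    Field names and layout are assumptions for illustration; they are not
    taken from the TraceGen or TraceForge implementation.
    """
    observation_id: str         # reference to the observed frame (assumed field)
    instruction: str            # language description of the task
    trace: List[List[Point3D]]  # trace[t][k] = 3D position of point k at step t

    def displacement(self) -> List[Point3D]:
        """Net 3D motion of each tracked point from the first to the last step."""
        first, last = self.trace[0], self.trace[-1]
        return [
            (b[0] - a[0], b[1] - a[1], b[2] - a[2])
            for a, b in zip(first, last)
        ]


# Toy data: two tracked points over three steps; only the first point moves.
example = TraceTriplet(
    observation_id="frame_000",
    instruction="push the block forward",
    trace=[
        [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)],
        [(0.0, 0.5, 1.0), (1.0, 0.0, 1.0)],
        [(0.0, 1.0, 1.0), (1.0, 0.0, 1.0)],
    ],
)
print(example.displacement())  # [(0.0, 1.0, 0.0), (0.0, 0.0, 0.0)]
```

In this toy layout the trace carries only geometry (where points move in 3D), not pixels or appearance, which is the property the abstract attributes to predicting in trace-space rather than pixel space.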
Nov-27-2025
- Country:
  - North America > United States (0.93)
- Genre:
  - Research Report (0.82)
- Industry:
  - Leisure & Entertainment > Sports (0.46)
  - Government (0.46)
- Technology:
  - Information Technology > Artificial Intelligence > Robots (1.00)