PhysHSI: Towards a Real-World Generalizable and Natural Humanoid-Scene Interaction System

Wang, Huayi; Zhang, Wentao; Yu, Runyi; Huang, Tao; Ren, Junli; Jia, Feiyu; Wang, Zirui; Niu, Xiaojie; Chen, Xiao; Chen, Jiahe; Chen, Qifeng; Wang, Jingbo; Pang, Jiangmiao

arXiv.org Artificial Intelligence 

[Figure caption, partial] PhysHSI can also learn (e) stylized locomotion, such as dinosaur-like walking and high-knee stepping.

Abstract: Deploying humanoid robots to interact with real-world environments, for example to carry objects or sit on chairs, requires generalizable, lifelike motions and robust scene perception. Although prior approaches have advanced each capability individually, combining them in a unified system remains an open challenge. In this work, we present a physical-world humanoid-scene interaction system, PhysHSI, that enables humanoids to autonomously perform diverse interaction tasks while maintaining natural and lifelike behaviors. PhysHSI comprises a simulation training pipeline and a real-world deployment system. In simulation, we adopt adversarial motion prior-based policy learning to imitate natural humanoid-scene interaction data across diverse scenarios, achieving both generalization and lifelike behaviors. For real-world deployment, we introduce a coarse-to-fine object localization module that combines LiDAR and camera inputs to provide continuous and robust scene perception.

Imagine deploying humanoid robots in everyday environments, where they carry boxes to different places or sit naturally on a chair. Building such a humanoid-scene interaction (HSI) system is considered more sophisticated than executing whole-body skills such as standing up [1, 2], dancing [3, 4], or performing agile motions [5-7].