Hand-Eye Autonomous Delivery: Learning Humanoid Navigation, Locomotion and Reaching
Sirui Chen, Yufei Ye, Zi-Ang Cao, Jennifer Lew, Pei Xu, C. Karen Liu
We propose Hand-Eye Autonomous Delivery (HEAD), a framework that learns navigation, locomotion, and reaching skills for humanoids directly from human motion and vision perception data. We take a modular approach in which the high-level planner commands the target positions and orientations of the humanoid's hands and eyes, and the low-level policy that controls the whole-body movements delivers them. Specifically, the low-level whole-body controller learns to track three points (eyes, left hand, and right hand) from existing large-scale human motion capture data, while the high-level policy learns from human data collected with Aria glasses. Our modular approach decouples egocentric visual perception from physical action, promoting efficient learning and scalability to novel scenes. We evaluate our method both in simulation and in the real world, demonstrating the humanoid's ability to navigate and reach in complex environments designed for humans.
arXiv.org Artificial Intelligence
Aug-11-2025
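
A minimal Python sketch of the modular, two-level interface the abstract describes, assuming illustrative names and dimensions (ThreePointTarget, HighLevelPlanner, WholeBodyController, and the joint count are hypothetical, not from the paper): the planner maps an egocentric frame to target poses for the three tracked points, and the whole-body controller maps those targets plus proprioception to joint commands, so perception never directly drives actuation.

```python
# Illustrative sketch only: class names, method signatures, and dimensions
# are assumptions, not the paper's API.
from dataclasses import dataclass
import numpy as np

@dataclass
class ThreePointTarget:
    """Target pose per tracked point: (position xyz, orientation wxyz quaternion)."""
    eyes: tuple
    left_hand: tuple
    right_hand: tuple

class HighLevelPlanner:
    """Assumed interface: egocentric camera frame -> three-point targets.
    In the paper, this policy is learned from human data collected with Aria glasses."""
    def plan(self, ego_frame: np.ndarray) -> ThreePointTarget:
        identity = np.array([1.0, 0.0, 0.0, 0.0])  # placeholder orientation
        return ThreePointTarget(
            eyes=(np.array([0.5, 0.0, 1.4]), identity),       # look ahead at eye height
            left_hand=(np.array([0.4, 0.2, 1.0]), identity),  # hypothetical reach targets
            right_hand=(np.array([0.4, -0.2, 1.0]), identity),
        )

class WholeBodyController:
    """Assumed interface: (proprioception, targets) -> joint commands.
    In the paper, this policy learns to track the three points from large-scale mocap."""
    def __init__(self, num_joints: int = 29):  # joint count is a guess
        self.num_joints = num_joints

    def act(self, proprio: np.ndarray, target: ThreePointTarget) -> np.ndarray:
        return np.zeros(self.num_joints)  # placeholder for the learned tracking policy

# One control step of the decoupled loop: vision feeds only the planner, and
# only the low-level controller produces whole-body motor commands.
planner, controller = HighLevelPlanner(), WholeBodyController()
ego_frame = np.zeros((480, 640, 3), dtype=np.uint8)  # dummy egocentric image
proprio = np.zeros(2 * controller.num_joints)        # dummy joint positions + velocities
joint_cmd = controller.act(proprio, planner.plan(ego_frame))
print(joint_cmd.shape)  # -> (29,)
```

The decoupling the abstract claims follows from this boundary: the planner can be retrained on new scenes without touching the motion-tracking policy, and vice versa.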