RobotDancing: Residual-Action Reinforcement Learning Enables Robust Long-Horizon Humanoid Motion Tracking
Zhenguo Sun, Yibo Peng, Yuan Meng, Xukun Li, Bo-Sheng Huang, Zhenshan Bing, Xinlong Wang, Alois Knoll
arXiv.org Artificial Intelligence
Abstract: Long-horizon, high-dynamic motion tracking on humanoids remains brittle because absolute joint commands cannot compensate for model-plant mismatch, leading to error accumulation. We propose RobotDancing, a simple, scalable framework that predicts residual joint targets to explicitly correct dynamics discrepancies. The pipeline is end-to-end (training, sim-to-sim validation, and zero-shot sim-to-real) and uses a single-stage reinforcement learning (RL) setup with a unified observation, reward, and hyperparameter configuration. RobotDancing tracks multi-minute, high-energy behaviors (jumps, spins, cartwheels) and deploys zero-shot to hardware with high motion-tracking quality.

I. INTRODUCTION

Humanoid robots are increasingly expected to execute long-horizon, highly dynamic behaviors such as dance, where small tracking errors compound rapidly and destabilize control. A principal source of such drift is the mismatch between idealized reference trajectories and the robot's true physics (actuation limits, friction, inertia, latency).
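The core idea of residual-action tracking can be illustrated with a minimal sketch: instead of commanding absolute joint angles, the policy outputs a small residual that is added to the reference joint targets before low-level control. The function name, residual scale, and clipping bounds below are illustrative assumptions, not details from the paper.

```python
import numpy as np

def residual_joint_targets(q_ref, residual, scale=0.25):
    """Combine reference joint angles with a bounded policy residual.

    q_ref:    reference joint angles from the motion clip (rad)
    residual: raw policy output, clipped to [-1, 1] then scaled
    scale:    hypothetical bound on the correction magnitude (rad)
    """
    correction = np.clip(residual, -1.0, 1.0) * scale
    return q_ref + correction

# Example: the residual nudges the commanded targets toward the true plant.
q_ref = np.array([0.0, 0.5, -0.3])     # reference joint angles (rad)
residual = np.array([0.1, -2.0, 0.4])  # raw policy output (second entry saturates)
q_target = residual_joint_targets(q_ref, residual)
# → [0.025, 0.25, -0.2]
```

Bounding the residual keeps the commanded targets close to the reference motion, so the policy can only learn corrections for dynamics discrepancies rather than arbitrary new motions.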
Sep-26-2025