Contrastive Representation Learning for Robust Sim-to-Real Transfer of Adaptive Humanoid Locomotion
Lu, Yidan, Yang, Rurui, Kou, Qiran, Chen, Mengting, Fan, Tao, Cui, Peter, Dong, Yinzhao, Lu, Peng
arXiv.org Artificial Intelligence
Abstract -- Reinforcement learning has produced remarkable advances in humanoid locomotion, yet a fundamental dilemma persists for real-world deployment: policies must choose between the robustness of reactive proprioceptive control and the proactivity of complex, fragile perception-driven systems. Our core contribution is a contrastive learning framework that compels the actor's latent state to encode privileged environmental information from simulation. Crucially, this "distilled awareness" empowers an adaptive gait clock, allowing the policy to proactively adjust its rhythm based on an inferred understanding of the terrain. This synergy resolves the classic trade-off between rigid, clocked gaits and unstable clock-free policies.

I. INTRODUCTION

Achieving stable and adaptive locomotion in unstructured environments is a grand challenge for humanoid robotics. While Deep Reinforcement Learning (DRL) has become a cornerstone for synthesizing such behaviors, a fundamental information gap complicates real-world deployment.
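The abstract does not specify the contrastive objective used to align the actor's latent state with the privileged environment encoding. A common choice for this kind of representation alignment is an InfoNCE-style loss, where each actor latent is pulled toward the privileged encoding of the same simulation state and pushed away from the other states in the batch. The sketch below is a minimal, hypothetical illustration of that idea in NumPy; the function name, temperature value, and batch construction are assumptions, not the paper's implementation.

```python
import numpy as np

def info_nce(actor_latents, privileged_latents, temperature=0.1):
    """Illustrative InfoNCE loss: row i of actor_latents is a positive
    pair with row i of privileged_latents; all other rows in the batch
    serve as negatives. (Assumed objective, not the paper's exact loss.)"""
    a = actor_latents / np.linalg.norm(actor_latents, axis=1, keepdims=True)
    p = privileged_latents / np.linalg.norm(privileged_latents, axis=1, keepdims=True)
    logits = a @ p.T / temperature                 # (B, B) cosine similarities
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))            # positives lie on the diagonal

# Toy check: matched actor/privileged pairs should score a lower loss
# than randomly paired latents.
rng = np.random.default_rng(0)
z = rng.normal(size=(8, 16))
loss_matched = info_nce(z, z)
loss_random = info_nce(z, rng.normal(size=(8, 16)))
```

In a training loop, this loss would be added to the usual policy-gradient objective so that, after training, the actor's latent carries terrain information even when the privileged encoder is unavailable on hardware.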
Sep-17-2025