Can Context Bridge the Reality Gap? Sim-to-Real Transfer of Context-Aware Policies

Iannotta, Marco, Yang, Yuxuan, Stork, Johannes A., Schaffernicht, Erik, Stoyanov, Todor

arXiv.org Artificial Intelligence 

Sim-to-real transfer remains a major challenge in reinforcement learning (RL) for robotics, as policies trained in simulation often fail to generalize to the real world due to discrepancies in environment dynamics. While standard approaches typically train policies agnostic to these variations, we investigate whether sim-to-real transfer can be improved by conditioning the policy on an estimate of the dynamics parameters -- referred to as context. To this end, we integrate a context estimation module into a domain randomization (DR)-based RL framework and systematically compare state-of-the-art supervision strategies. We evaluate the resulting context-aware policies on both a canonical control benchmark and a real-world pushing task using a Franka Emika Panda robot. Results show that context-aware policies outperform the context-agnostic baseline across all settings, although the best supervision strategy depends on the task.

Introduction

Reinforcement learning (RL) has achieved significant success in developing robot controllers capable of solving complex tasks [1]. However, training policies directly on physical hardware is typically slow, costly, and potentially unsafe. To address these limitations, physics simulation engines are widely used as a safer and more efficient alternative for policy training. Once a policy has been trained in simulation, it is transferred to the physical robot -- a process known as sim-to-real transfer [2, 1, 3]. Although promising, this paradigm is hindered by the reality gap (also called the sim-to-real gap): the discrepancy between the simulated and real-world environments [4, 5].
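To make the idea concrete, the setup described above -- DR sampling hidden dynamics parameters, a context estimation module inferring them online, and a policy conditioned on the estimate -- can be sketched as follows. This is a minimal illustration, not the paper's implementation: the class names, dimensions, and the use of fixed random linear maps in place of learned networks are all assumptions made for the sake of a runnable example.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_dynamics_context(low=np.array([0.5, 0.05]), high=np.array([2.0, 0.5])):
    """Domain randomization: at the start of each episode, sample the hidden
    dynamics parameters (e.g. mass, friction) uniformly from a fixed range."""
    return rng.uniform(low, high)

class ContextEstimator:
    """Hypothetical module that regresses the dynamics context from a short
    history of recent transitions. A fixed random linear map stands in for
    the learned network here."""
    def __init__(self, history_dim, context_dim):
        self.W = rng.normal(scale=0.1, size=(context_dim, history_dim))

    def __call__(self, history):
        return self.W @ history

class ContextAwarePolicy:
    """Context-aware policy: the context estimate is concatenated to the
    observation before the action is computed, so the controller can adapt
    its behavior to the inferred dynamics."""
    def __init__(self, obs_dim, context_dim, act_dim):
        self.W = rng.normal(scale=0.1, size=(act_dim, obs_dim + context_dim))

    def act(self, obs, context_estimate):
        return np.tanh(self.W @ np.concatenate([obs, context_estimate]))

obs_dim, act_dim, context_dim, horizon = 4, 2, 2, 8
estimator = ContextEstimator(history_dim=horizon * (obs_dim + act_dim),
                             context_dim=context_dim)
policy = ContextAwarePolicy(obs_dim, context_dim, act_dim)

true_context = sample_dynamics_context()                   # hidden from the policy
history = rng.normal(size=horizon * (obs_dim + act_dim))   # recent (obs, action) pairs
obs = rng.normal(size=obs_dim)

context_hat = estimator(history)    # online estimate of the dynamics parameters
action = policy.act(obs, context_hat)
print(action.shape)  # (2,)
```

A context-agnostic baseline would simply drop `context_hat` from the policy input; the supervision strategies compared in the paper differ in how the estimator is trained (e.g. against the ground-truth simulation parameters versus end-to-end with the policy).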