DiffSim2Real: Deploying Quadrupedal Locomotion Policies Purely Trained in Differentiable Simulation

Bagajo, Joshua, Schwarke, Clemens, Klemm, Victor, Georgiev, Ignat, Sleiman, Jean-Pierre, Tordesillas, Jesus, Garg, Animesh, Hutter, Marco

Nov-4-2024–arXiv.org Artificial Intelligence

Abstract: Differentiable simulators provide analytic gradients, enabling more sample-efficient learning algorithms and paving the way for data-intensive learning tasks such as learning from images. In this work, we demonstrate that locomotion policies trained with analytic gradients from a differentiable simulator can be successfully transferred to the real world. Typically, simulators that offer informative gradients lack the physical accuracy needed for sim-to-real transfer, and vice versa. A key factor in our success is a smooth contact model that combines informative gradients with physical accuracy, ensuring effective transfer of learned behaviors. To the best of our knowledge, this is the first time a real quadrupedal robot is able to locomote after training exclusively in a differentiable simulation. The majority of Reinforcement Learning (RL) algorithms rely on Zeroth-order Gradient (ZoG) estimates during optimization, allowing the use of conventional physics simulators that are typically non-differentiable.
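The contrast between analytic gradients and ZoG estimates can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the quadratic `objective` stands in for a rollout loss, and `zeroth_order_gradient` uses a generic Gaussian-perturbation estimator with a baseline, not the specific estimator of any particular RL algorithm.

```python
import random


def objective(theta):
    """Toy smooth objective (hypothetical stand-in for a rollout loss)."""
    return sum(t * t for t in theta)


def analytic_gradient(theta):
    """First-order gradient, available in closed form for this toy objective."""
    return [2.0 * t for t in theta]


def zeroth_order_gradient(theta, num_samples=20000, sigma=0.1, seed=0):
    """Estimate the gradient from function evaluations only.

    Averages (f(theta + sigma * eps) - f(theta)) / sigma * eps over
    Gaussian perturbations eps; the unperturbed value acts as a baseline.
    """
    rng = random.Random(seed)
    base = objective(theta)
    grad = [0.0] * len(theta)
    for _ in range(num_samples):
        eps = [rng.gauss(0.0, 1.0) for _ in theta]
        perturbed = [t + sigma * e for t, e in zip(theta, eps)]
        scale = (objective(perturbed) - base) / sigma
        for j, e in enumerate(eps):
            grad[j] += scale * e
    return [g / num_samples for g in grad]


if __name__ == "__main__":
    theta = [1.0, -2.0]
    print("analytic:    ", analytic_gradient(theta))
    print("zeroth-order:", zeroth_order_gradient(theta))
```

In this sketch the zeroth-order estimate needs many objective evaluations to approach the value that the analytic gradient gives directly, which is the sample-efficiency argument the abstract makes. It also only requires function values, which is why such estimators work with non-differentiable simulators.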

machine learning, reinforcement learning, simulation

Country:
- Europe > Switzerland (0.29)

Genre:
- Research Report (0.40)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning > Reinforcement Learning (0.69)
  - Robots (1.00)