VAE-Loco: Versatile Quadruped Locomotion by Learning a Disentangled Gait Representation

Mitchell, Alexander L., Merkt, Wolfgang, Geisert, Mathieu, Gangapurwala, Siddhant, Engelcke, Martin, Jones, Oiwi Parker, Havoutis, Ioannis, Posner, Ingmar

arXiv.org Artificial Intelligence 

Abstract: Quadruped locomotion is rapidly maturing to a degree where robots are able to realise highly dynamic manoeuvres. However, current planners are unable to vary key gait parameters of the in-swing feet midair. In this work we address this limitation and show that it is pivotal in increasing controller robustness by learning a latent space capturing the key stance phases constituting a particular gait. This is achieved via a generative model trained on a single trot style, which encourages disentanglement such that application of a drive signal to a single dimension of the latent state induces holistic plans synthesising a continuous variety of trot styles. We demonstrate that specific properties of the drive signal map directly to gait parameters such as cadence, footstep height and full stance duration. Due to the nature of our approach, these synthesised gaits are continuously variable online during robot operation. The use of a generative model facilitates the detection and mitigation of disturbances to provide a versatile and robust planning framework. We evaluate our approach on two versions of the real ANYmal quadruped robot and demonstrate that our method achieves a continuous blend of dynamic trot styles whilst being robust and reactive to external perturbations.

Figure 1: Using a variational auto-encoder (VAE), our approach learns a structured latent space capturing key stance phases constituting a particular gait. The space is disentangled to [...] This approach allows for precise base twist control and readily transfers from ANYmal B to ANYmal C, a dynamically dissimilar robot, without retraining. Additionally, we measure disturbances as out of the distribution seen during training and adjust cadence as a rudimentary, but effective response.

I. INTRODUCTION

[...] by advances in optimisation-based [1]-[5] and reinforcement learning-based methods [6]-[8], quadrupeds are now able to robustly plan and perform dynamic manoeuvres, making them an increasingly popular and reliable choice for tasks such as inspection, monitoring, search and rescue or goods delivery [...] However, despite recent advances, important limitations remain. Due to the complexity of the system, models used for gait planning and control are often overly [...] contact schedules [1], [9]. [...] These are often computationally expensive [3], [4], meaning that varying the gait parameters is not achievable in real time. A limitation of all these methods is that they are unable to adjust key [...]

Mathieu Geisert is with Agility Robotics, U.S.A. Work done while at [...] Martin Engelcke is with DeepMind Technologies Ltd., London, U.K. Work done while at Oxford. Personal use of this material is permitted. [...]

[...] characterise and react to external perturbations. A large impulse applied to the robot's base triggers a spike in the Evidence Lower Bound (ELBO), which clearly identifies the disturbance as out of the distribution seen during training.
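The ELBO-based disturbance check described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a VAE with a diagonal-Gaussian encoder, a standard-normal prior and a unit-variance Gaussian decoder, and it reads the "spike" as the ELBO falling well below its nominal level (equivalently, a spike in the negative ELBO). The function names, the threshold `k` and the `response_scale` factor are hypothetical; the text does not state the direction or size of the cadence adjustment.

```python
import numpy as np


def gaussian_elbo(x, x_hat, mu, logvar):
    """Single-sample ELBO estimate.

    Assumptions (not from the source): diagonal-Gaussian encoder with mean
    `mu` and log-variance `logvar`, standard-normal prior, and a
    unit-variance Gaussian decoder whose mean is the reconstruction `x_hat`.
    """
    x, x_hat, mu, logvar = (np.asarray(a, dtype=float) for a in (x, x_hat, mu, logvar))
    # Reconstruction term: log N(x | x_hat, I)
    log_lik = -0.5 * np.sum((x - x_hat) ** 2) - 0.5 * x.size * np.log(2.0 * np.pi)
    # Closed-form KL( N(mu, diag(exp(logvar))) || N(0, I) )
    kl = 0.5 * np.sum(np.exp(logvar) + mu ** 2 - 1.0 - logvar)
    return float(log_lik - kl)


def is_disturbed(elbo, nominal_mean, nominal_std, k=3.0):
    """Flag an input as out of distribution when its ELBO lies more than
    k standard deviations below the nominal mean (k is hypothetical)."""
    return elbo < nominal_mean - k * nominal_std


def adjust_cadence(cadence_hz, disturbed, response_scale=1.5):
    """Rescale the commanded cadence while a disturbance is flagged.

    `response_scale` is a placeholder; whether cadence should rise or fall,
    and by how much, is not specified in the text.
    """
    return cadence_hz * response_scale if disturbed else cadence_hz


# Usage with toy numbers: a perfect reconstruction versus a large residual.
nominal_elbo = gaussian_elbo(np.zeros(4), np.zeros(4), np.zeros(2), np.zeros(2))
pushed_elbo = gaussian_elbo([5.0, 0.0, 0.0, 0.0], np.zeros(4), np.zeros(2), np.zeros(2))
flag = is_disturbed(pushed_elbo, nominal_mean=nominal_elbo, nominal_std=1.0)
new_cadence = adjust_cadence(2.0, flag)
```

In practice the nominal mean and standard deviation would be estimated from ELBO values logged during undisturbed operation, and the encoder and decoder outputs would come from the trained model rather than the toy arrays used here.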
