DMAP: a Distributed Morphological Attention Policy for learning to locomote with a changing body
Biological and artificial agents need to deal with constant changes in the real world. We study this problem in four classical continuous control environments, augmented with morphological perturbations. Learning to locomote when the length and the thickness of different body parts vary is challenging, as the control policy is required to adapt to the morphology to successfully balance and advance the agent. We show that a control policy based on the proprioceptive state performs poorly with highly variable body configurations, while an (oracle) agent with access to a learned encoding of the perturbation performs significantly better. We introduce DMAP, a biologically-inspired, attention-based policy network architecture. DMAP combines independent proprioceptive processing, a distributed policy with individual controllers for each joint, and an attention mechanism, to dynamically gate sensory information from different body parts to different controllers. Despite not having access to the (hidden) morphology information, DMAP can be trained end-to-end in all the considered environments, overall matching or surpassing the performance of an oracle agent. Thus DMAP, implementing principles from biological motor control, provides a strong inductive bias for learning challenging sensorimotor tasks.
AllGaits: Learning All Quadruped Gaits and Transitions
Bellegarda, Guillaume, Shafiee, Milad, Ijspeert, Auke
We present a framework for learning a single policy capable of producing all quadruped gaits and transitions. The framework consists of a policy trained with deep reinforcement learning (DRL) to modulate the parameters of a system of abstract oscillators (i.e. a Central Pattern Generator), whose output is mapped to joint commands through a pattern formation layer that sets the gait style, i.e. body height, swing foot ground clearance height, and foot offset. Different gaits are formed by changing the coupling between different oscillators, and a user can select any gait instantaneously at any velocity. With this framework, we systematically investigate which gait should be used at which velocity, and when gait transitions should occur, from a Cost of Transport (COT), i.e. energy-efficiency, point of view. Additionally, we note how gait style changes as a function of locomotion speed for each gait to maintain the most energy-efficient locomotion. While the currently most popular gait (trot) does not result in the lowest COT, we find that considering different co-dependent metrics such as mean base velocity and joint acceleration results in different 'optimal' gaits than those that minimize COT. We deploy our controller in various hardware experiments, showing all 9 typical quadruped animal gaits, and demonstrate generalization to gaits unseen during training, and robustness to leg failures. Video results can be found at https://youtu.be/OLoWSX_R868.
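The statement that gaits are formed by changing the coupling between oscillators can be illustrated with a minimal sketch. It assumes four Kuramoto-style phase oscillators (one per leg) with all-to-all coupling and conventional textbook phase offsets for trot, pace, and bound. The names `GAIT_OFFSETS`, `step_cpg`, `simulate`, and `phase_error`, the coupling gain, and the frequency are all hypothetical. This is not the oscillator model or the learned policy from the paper.

```python
import numpy as np

# Assumed phase offsets (radians) per leg, in the order FL, FR, RL, RR.
# These are conventional idealised patterns, not values from the paper.
GAIT_OFFSETS = {
    "trot": np.array([0.0, np.pi, np.pi, 0.0]),   # diagonal legs in phase
    "pace": np.array([0.0, np.pi, 0.0, np.pi]),   # same-side legs in phase
    "bound": np.array([0.0, 0.0, np.pi, np.pi]),  # front pair vs. rear pair
}


def step_cpg(theta, gait, freq_hz=2.0, coupling=5.0, dt=0.001):
    """Advance the four oscillator phases by one explicit Euler step."""
    psi = GAIT_OFFSETS[gait]
    # diff[i, j] = (theta_j - theta_i) - (psi_j - psi_i): zero when the
    # phase lag between legs i and j matches the selected gait.
    diff = (theta[None, :] - theta[:, None]) - (psi[None, :] - psi[:, None])
    dtheta = 2.0 * np.pi * freq_hz + coupling * np.sin(diff).sum(axis=1)
    return theta + dt * dtheta


def simulate(gait, theta0, steps=5000):
    """Integrate the oscillators for a fixed number of steps."""
    theta = np.asarray(theta0, dtype=float)
    for _ in range(steps):
        theta = step_cpg(theta, gait)
    return theta


def phase_error(theta, gait):
    """Largest wrapped deviation from the gait's phase pattern (rad)."""
    psi = GAIT_OFFSETS[gait]
    rel = (theta - psi) - (theta[0] - psi[0])
    return np.abs(np.angle(np.exp(1j * rel))).max()


if __name__ == "__main__":
    perturbation = np.array([0.3, -0.2, 0.5, -0.4])
    for name, psi in GAIT_OFFSETS.items():
        final = simulate(name, psi + perturbation)
        print(name, phase_error(final, name))
```

In this sketch the `gait` argument only changes the offset matrix inside the coupling term, so the same oscillators settle into a different leg pattern without any change to their intrinsic frequency.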
Design of Soft, Modular Appendages for a Bio-inspired Multi-Legged Terrestrial Robot
Siddiquee, Abu Nayem Md. Asraf, Colfer, Benjamin, Ozkan-Aydin, Yasemin
Soft robots have the ability to adapt to their environment, which makes them suitable for use in disaster areas and agricultural fields, where mobility is constrained by complex terrain. One of the main challenges in developing soft terrestrial robots is that the robot must be soft enough to adapt to its environment, but also rigid enough to exert the required force on the ground to locomote. In this paper, we report a pneumatically driven, soft modular appendage made of silicone for a terrestrial robot, capable of generating the specific mechanical movement needed to locomote and transport loads in the desired direction. This two-segment soft appendage uses actuation between the joint and the lower segment of the appendage to ensure adequate rigidity to exert the required force to locomote. A prototype of a soft-rigid-bodied tethered physical robot was developed, and two sets of experiments were carried out, in air and underwater, to assess its performance. The experimental results show that the soft appendage generates adequate force to navigate through various environments. Our design method offers a simple, low-cost, and efficient way to develop terradynamically capable soft appendages that can be used in a variety of locomotion applications.
DMAP: a Distributed Morphological Attention Policy for Learning to Locomote with a Changing Body
Chiappa, Alberto Silvio, Vargas, Alessandro Marin, Mathis, Alexander
Biological and artificial agents need to deal with constant changes in the real world. We study this problem in four classical continuous control environments, augmented with morphological perturbations. Learning to locomote when the length and the thickness of different body parts vary is challenging, as the control policy is required to adapt to the morphology to successfully balance and advance the agent. We show that a control policy based on the proprioceptive state performs poorly with highly variable body configurations, while an (oracle) agent with access to a learned encoding of the perturbation performs significantly better. We introduce DMAP, a biologically-inspired, attention-based policy network architecture. DMAP combines independent proprioceptive processing, a distributed policy with individual controllers for each joint, and an attention mechanism, to dynamically gate sensory information from different body parts to different controllers. Despite not having access to the (hidden) morphology information, DMAP can be trained end-to-end in all the considered environments, overall matching or surpassing the performance of an oracle agent. Thus DMAP, implementing principles from biological motor control, provides a strong inductive bias for learning challenging sensorimotor tasks. Overall, our work corroborates the power of these principles in challenging locomotion tasks.
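The three ingredients named in the abstract (independent proprioceptive processing, one controller per joint, and attention that gates body-part information to each controller) can be sketched as follows. This is a toy illustration under stated assumptions, not the DMAP architecture. The class name `DistributedAttentionPolicy`, the layer sizes, the random untrained weights, and the use of current-step features to compute attention are all hypothetical choices made for brevity.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


class DistributedAttentionPolicy:
    """Toy policy: per-part encoders, per-joint attention, per-joint controllers."""

    def __init__(self, n_parts, obs_per_part, n_joints, feat_dim=8, seed=0):
        rng = np.random.default_rng(seed)
        # One independent linear encoder per body part (assumed form).
        self.w_enc = rng.normal(0.0, 0.5, (n_parts, obs_per_part, feat_dim))
        # One query vector per joint, used to score each body part.
        self.w_att = rng.normal(0.0, 0.5, (n_joints, feat_dim))
        # One linear controller per joint.
        self.w_ctrl = rng.normal(0.0, 0.5, (n_joints, feat_dim))

    def act(self, obs):
        """Map observations of shape (n_parts, obs_per_part) to joint actions."""
        # Encode each body part separately: result has shape (parts, features).
        feats = np.tanh(np.einsum("po,pof->pf", obs, self.w_enc))
        # Attention of each joint controller over body parts: (joints, parts).
        attn = softmax(self.w_att @ feats.T, axis=1)
        # Each controller receives its own weighted mix of part features.
        gated = attn @ feats
        # One scalar action per joint, squashed to [-1, 1].
        actions = np.tanh(np.sum(gated * self.w_ctrl, axis=1))
        return actions, attn


if __name__ == "__main__":
    policy = DistributedAttentionPolicy(n_parts=4, obs_per_part=3, n_joints=6)
    actions, attn = policy.act(np.ones((4, 3)))
    print(actions.shape, attn.shape)
```

Because the attention weights are recomputed from the observations at every call, the routing from body parts to controllers changes with the sensed state. That is the sense in which the gating is dynamic in this sketch.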