Legged robots pose one of the greatest challenges in robotics. Dynamic and agile maneuvers of animals cannot be imitated by existing methods that are hand-crafted by human experts. A compelling alternative is reinforcement learning, which requires minimal craftsmanship and promotes the natural evolution of a control policy. However, so far, reinforcement learning research for legged robots has mainly been limited to simulation, and only a few comparably simple examples have been deployed on real systems. The primary reason is that training with real robots, particularly with dynamically balancing systems, is complicated and expensive. In the present work, we introduce a method for training a neural network policy in simulation and transferring it to a state-of-the-art legged system, thereby leveraging fast, automated, and cost-effective data generation schemes. The approach is applied to the ANYmal robot, a sophisticated medium-dog-sized quadrupedal system. Using policies trained in simulation, the quadrupedal machine achieves locomotion skills that go beyond what had been achieved with prior methods: ANYmal is capable of precisely and energy-efficiently following high-level body velocity commands, running faster than ever before, and recovering from falls even in complex configurations.
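At deployment time, a sim-to-real pipeline of this kind reduces to evaluating a small neural network that maps the robot's observation vector to joint position targets. The sketch below illustrates only that inference step; the layer sizes, observation contents, and NumPy implementation are illustrative assumptions, not the authors' actual architecture or trained weights.

```python
import numpy as np

# Hypothetical dimensions for illustration: ANYmal has 12 actuated joints,
# but the real observation vector (state history, velocity command, etc.)
# and hidden sizes differ from these placeholders.
OBS_DIM, HIDDEN, ACT_DIM = 36, 64, 12

rng = np.random.default_rng(0)

class MLPPolicy:
    """Two-layer tanh network mapping an observation to joint position targets."""

    def __init__(self):
        # Random weights stand in for parameters learned in simulation.
        self.w1 = rng.standard_normal((OBS_DIM, HIDDEN)) * 0.1
        self.b1 = np.zeros(HIDDEN)
        self.w2 = rng.standard_normal((HIDDEN, ACT_DIM)) * 0.1
        self.b2 = np.zeros(ACT_DIM)

    def act(self, obs):
        h = np.tanh(obs @ self.w1 + self.b1)
        # Outputs in [-1, 1]; a real controller would rescale these to
        # joint position targets for a low-level tracking controller.
        return np.tanh(h @ self.w2 + self.b2)

policy = MLPPolicy()
obs = np.zeros(OBS_DIM)  # placeholder observation
action = policy.act(obs)
print(action.shape)  # (12,)
```

Keeping the deployed policy this small is part of what makes on-board, real-time inference feasible on the robot's embedded computer.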
Machine learning powers a new kind of drone flight controller software, researchers report. After Wil Koch flew a friend's drone for the first time, operating it in "first-person view," in which the pilot wears a headset connected to a live video feed streaming from a camera on the drone, he thought it was amazing. So amazing that he went out that same day and purchased his own system: a video headset, a controller, and a quadcopter drone, named for the four propellers that power it. "You put the goggles on and they allow you to see live video transmitting from a camera mounted on the drone," Koch says.
Designing agile locomotion for quadruped robots often requires extensive expertise and tedious manual tuning. In this paper, we present a system to automate this process by leveraging deep reinforcement learning techniques. Our system can learn quadruped locomotion from scratch using simple reward signals. In addition, users can provide an open-loop reference to guide the learning process when more control over the learned gait is needed. The control policies are learned in a physics simulator and then deployed on real robots. In robotics, policies trained in simulation often do not transfer to the real world. We narrow this reality gap by improving the physics simulator and learning robust policies. We improve the simulation using system identification, developing an accurate actuator model, and simulating latency. We learn robust controllers by randomizing the physical environments, adding perturbations, and designing a compact observation space. We evaluate our system on two agile locomotion gaits: trotting and galloping. After learning in simulation, a quadruped robot can successfully perform both gaits in the real world.
Control algorithms offer the promise of simplifying a dynamic system's apparent behavior as perceived by an intelligent agent, thus making the agent's task much easier. However, the coupled dynamics of such a hybrid system can be difficult to predict and may lead to undesirable behavior. We demonstrate that a rational intelligent agent acting on a well-controlled dynamical system can, once the two are coupled, produce undesirable behavior, and we present a method for analyzing the resulting dynamics of such coupled, hybrid systems. A technique for alleviating these behaviors using newly developed control algorithms is then suggested. These controllers, which are adaptive in nature, also suggest the possibility of "distributing" learning and intelligence between the high and low levels of authority.