Figure 1: Summary of our recommendations for when a practitioner should use BC and various imitation-learning-style methods, and when they should use offline RL approaches.
Offline reinforcement learning allows learning policies from previously collected data, which has profound implications for applying RL in domains where running trial-and-error learning is impractical or dangerous, such as safety-critical settings like autonomous driving or medical treatment planning. In such scenarios, online exploration is simply too risky, but offline RL methods can learn effective policies from logged data collected by humans or heuristically designed controllers. Prior learning-based control methods have also approached learning from existing data as imitation learning: if the data is generally "good enough," simply copying the behavior in the data can lead to good results, and if it's not good enough, then filtering or reweighting the data and then copying can work well. Several recent works suggest that this is a viable alternative to modern offline RL methods.
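To make the "filter, then copy" idea concrete, here is a minimal tabular sketch of filtered behavior cloning. It is an illustration under stated assumptions, not the method of any particular paper: the data layout, the function names `filter_top_trajectories` and `behavior_cloning`, the `keep_fraction` parameter, and the toy logged data are all hypothetical.

```python
from collections import Counter, defaultdict


def filter_top_trajectories(trajectories, keep_fraction=0.5):
    """Keep the highest-return fraction of trajectories (always at least one)."""
    ranked = sorted(trajectories, key=lambda t: t["return"], reverse=True)
    n_keep = max(1, int(len(ranked) * keep_fraction))
    return ranked[:n_keep]


def behavior_cloning(trajectories):
    """Tabular BC: in each state, copy the most frequent action in the data."""
    counts = defaultdict(Counter)
    for traj in trajectories:
        for state, action in traj["steps"]:
            counts[state][action] += 1
    return {state: c.most_common(1)[0][0] for state, c in counts.items()}


# Hypothetical logged data: two good trajectories and three poor ones.
logged = [
    {"return": 10.0, "steps": [("s0", "right"), ("s1", "right")]},
    {"return": 9.0, "steps": [("s0", "right"), ("s1", "right")]},
    {"return": 1.0, "steps": [("s0", "left"), ("s1", "left")]},
    {"return": 0.5, "steps": [("s0", "left"), ("s1", "left")]},
    {"return": 0.0, "steps": [("s0", "left"), ("s1", "left")]},
]

# Plain BC copies the majority behavior, which here is the poor one.
plain_policy = behavior_cloning(logged)

# Filtered BC first discards low-return trajectories, then copies the rest.
filtered_policy = behavior_cloning(filter_top_trajectories(logged, keep_fraction=0.5))
```

In this toy dataset plain BC imitates the majority (low-return) behavior, while filtering by return before cloning recovers the high-return actions. Real systems would replace the lookup table with a learned policy and may reweight rather than hard-filter.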
I believe that one of the best ways to get the training you need for the robotics job market is to attend tutorials at conferences like ICRA. Unlike workshops, where you might listen to work-in-progress talks, workshop paper presentations, and panel discussions, tutorials are exactly what they sound like. They aim to give you hands-on learning sessions on technical tools and skills with specific learning objectives. As such, most tutorials expect you to come prepared to actively participate and follow along. For instance, the "Tools for Robotic Reinforcement Learning" tutorial expects you to come knowing how to code in Python and with basic knowledge of reinforcement learning, because you'll be expected to use those skills in the hands-on sessions. There are seven tutorials this year.
In recent years, we've seen a resurgence in AI, or artificial intelligence, and machine learning. Machine learning has led to some amazing results, like being able to analyze medical images and predict diseases on par with human experts. Google's AlphaGo program was able to beat a world champion in the strategy game Go using deep reinforcement learning. Machine learning is even being used to program self-driving cars, which is going to change the automotive industry forever. Imagine a world with drastically reduced car accidents, simply by removing the element of human error.
Congratulations to Pieter Abbeel, who has been awarded the ACM Prize in Computing for his contribution to robot learning, including learning from demonstrations and deep reinforcement learning for robotic control. Pieter Abbeel is a Professor of Computer Science and Electrical Engineering at the University of California, Berkeley and the Co-Founder, President and Chief Scientist at Covariant, an AI robotics company. He also hosts The Robot Brains podcast. The ACM Prize in Computing recognizes an early- to mid-career fundamental, innovative contribution in computing that, through its depth, impact and broad implications, exemplifies the greatest achievements in the discipline. The award carries a prize of $250,000.
This year's ACM Prize in Computing is going to a machine learning specialist whose work, even if you haven't heard of him, is likely to be familiar. Pieter Abbeel, UC Berkeley professor and co-founder of AI robotics company Covariant, was awarded the prize and its $250,000 bounty, which is given to early- to mid-career computer scientists "whose research contributions have fundamental impact and broad implications." Abbeel is a professor of computer science and electrical engineering whose work has already received some recognition. Along with this new award, he was named a top young innovator under 35 by the MIT Technology Review and won a prize given out to the best US PhD thesis in robotics and automation. ACM said Abbeel was a trailblazer in apprenticeship and reinforcement learning, and highlighted a clothes-folding robot he designed that was better able to manipulate deformable objects.
Cathy Wu is the Gilbert W. Winslow Assistant Professor of Civil and Environmental Engineering and a member of the MIT Institute for Data, Systems, and Society. As an undergraduate, Wu won MIT's toughest robotics competition, and as a graduate student took the University of California at Berkeley's first-ever course on deep reinforcement learning. Now back at MIT, she's working to improve the flow of robots in Amazon warehouses under the Science Hub, a new collaboration between the tech giant and the MIT Schwarzman College of Computing. Outside of the lab and classroom, Wu can be found running, drawing, pouring lattes at home, and watching YouTube videos on math and infrastructure via 3Blue1Brown and Practical Engineering. She recently took a break from all of that to talk about her work.
Quadrupedal robots are becoming a familiar sight, but engineers are still working out the full capabilities of these machines. Now, a group of researchers from MIT says one way to improve their functionality might be to use AI to help teach the bots how to walk and run. Usually, when engineers are creating the software that controls the movement of legged robots, they write a set of rules about how the machine should respond to certain inputs. So, if a robot's sensors detect x amount of force on leg y, it will respond by powering up motor a to exert torque b, and so on. Coding these parameters is complicated and time-consuming, but it gives researchers precise and predictable control over the robots.
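As a rough illustration of the hand-written approach described above, the sketch below encodes "force x on leg y triggers torque b on motor a" as a small rule table. Every name, threshold, and torque value here is hypothetical and chosen only for the example; it does not reflect any real robot's controller.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    leg: str                # which leg's force sensor to watch
    force_threshold: float  # newtons (hypothetical value)
    motor: str              # which motor to power up
    torque: float           # newton-metres (hypothetical value)


# Hypothetical hand-tuned rule table written by an engineer.
RULES = [
    Rule("front_left", 50.0, "front_left_knee", 12.0),
    Rule("rear_right", 80.0, "rear_right_hip", 20.0),
]


def torque_commands(leg_forces, rules=RULES):
    """Return a motor -> torque mapping for every rule whose threshold is met."""
    commands = {}
    for rule in rules:
        if leg_forces.get(rule.leg, 0.0) >= rule.force_threshold:
            commands[rule.motor] = rule.torque
    return commands


# Only the front-left leg exceeds its threshold, so only that rule fires.
commands = torque_commands({"front_left": 65.0, "rear_right": 10.0})
```

The behavior is fully predictable because every response is spelled out in the table, which is also why such controllers become laborious to write as the number of situations grows.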
Simulation and synthesis are core parts of the future of AI and machine learning. Consider: programmers, data scientists, and machine learning engineers can create the brain of a self-driving car without the car. Rather than use information from the real world, you can create artificial data using simulations to train traditional machine learning models. With this practical book, you'll explore the possibilities of simulation- and synthesis-based machine learning and AI, with a focus on deep reinforcement learning and imitation learning techniques. AI and ML are increasingly data driven, and simulations are a powerful, engaging way to unlock their full potential.
The combination of deep learning and decision learning has led to several impressive stories in decision-making AI research, including AIs that can play a variety of games (Atari video games, board games, and the complex real-time strategy game StarCraft II), control robots (in simulation and in the real world), and even fly a weather balloon. These are examples of sequential decision tasks, in which the AI agent needs to make a sequence of decisions to achieve its goal. Today, the two main approaches for training such agents are reinforcement learning (RL) and imitation learning (IL). In reinforcement learning, humans provide rewards for completing discrete tasks, with the rewards typically being delayed and sparse. For example, 100 points are given for solving the first room of Montezuma's Revenge (Fig. 1). In the imitation learning setting, humans can transfer knowledge and skills through step-by-step action demonstrations (Fig. 2), and the agent then learns to mimic human actions.
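The difference between the two training signals can be sketched in a few lines. The example below is a toy under stated assumptions: the four-step episode, the discount factor `gamma=0.5`, the observation and action names, and the helper `discounted_returns` are all hypothetical; only the 100-point reward comes from the text above.

```python
def discounted_returns(rewards, gamma=0.99):
    """Compute the discounted return G_t = r_t + gamma * G_{t+1} for each step."""
    returns = []
    running = 0.0
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


# RL signal: a sparse, delayed reward. Nothing until the room is solved.
sparse_rewards = [0.0, 0.0, 0.0, 100.0]
returns = discounted_returns(sparse_rewards, gamma=0.5)

# IL signal: a demonstrated action at every step (hypothetical observations).
demonstration = [
    ("obs_0", "jump"),
    ("obs_1", "right"),
    ("obs_2", "climb"),
    ("obs_3", "grab_key"),
]

# Count how many steps carry a direct learning signal in each setting.
rl_signal_steps = sum(1 for r in sparse_rewards if r != 0.0)
il_signal_steps = len(demonstration)
```

With a sparse reward the agent must propagate credit backward from the single rewarded step, whereas a demonstration supplies a supervised target at every step.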