
Collaborating Authors: Schaal


Neural Dynamic Policies for End-to-End Sensorimotor Learning

Neural Information Processing Systems

The current dominant paradigm in sensorimotor control, whether imitation or reinforcement learning, is to train policies directly in raw action spaces such as torque, joint angle, or end-effector position. This forces the agent to make a decision at each point in training, and hence limits the scalability to continuous, high-dimensional, and long-horizon tasks. In contrast, research in classical robotics has, for a long time, exploited dynamical systems as a policy representation to learn robot behaviors via demonstrations.
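The dynamical-systems policy representation the abstract contrasts with raw action spaces is typically a dynamic movement primitive (DMP): a damped point attractor plus a phase-gated forcing term. A minimal sketch (forcing term set to zero for brevity; all gains and the `dmp_rollout` name are illustrative, not from the paper):

```python
import numpy as np

def dmp_rollout(x0, g, T=1.0, dt=0.001, alpha=25.0, beta=6.25, alpha_s=4.0):
    """Integrate a discrete dynamic movement primitive toward goal g.

    The canonical phase s decays from 1 to 0; a learned forcing term f(s)
    (zero here) is scaled by s, so as the phase vanishes the critically
    damped point attractor guarantees convergence to the goal.
    """
    x, v, s = float(x0), 0.0, 1.0
    traj = [x]
    for _ in range(int(T / dt)):
        f = 0.0  # a learned, phase-dependent forcing term would go here
        dv = alpha * (beta * (g - x) - v) + f   # point-attractor dynamics
        v += dv * dt / T
        x += v * dt / T
        s += -alpha_s * s * dt / T              # canonical (phase) system
        traj.append(x)
    return np.array(traj)
```

Because the output is a whole trajectory shaped by a few parameters, the agent no longer decides an action at every timestep, which is the scalability argument the abstract makes.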


Combining Movement Primitives with Contraction Theory

Nah, Moses C., Lachner, Johannes, Hogan, Neville, Slotine, Jean-Jacques

arXiv.org Artificial Intelligence

This paper presents a modular framework for motion planning using movement primitives. Central to the approach is Contraction Theory, a modular stability tool for nonlinear dynamical systems. The approach extends prior methods by achieving parallel and sequential combinations of both discrete and rhythmic movements, while enabling independent modulation of each movement. This modular framework enables a divide-and-conquer strategy to simplify the programming of complex robot motion planning. Simulation examples illustrate the flexibility and versatility of the framework, highlighting its potential to address diverse challenges in robot motion planning.
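The stability tool the abstract relies on has a simple sufficient condition: a system is contracting on a region if the symmetric part of its Jacobian is negative definite there, and contraction is preserved under the parallel and sequential combinations mentioned above. A numerical sanity check of that condition (the function name and sampling scheme are illustrative, not the paper's API):

```python
import numpy as np

def is_contracting(jacobian, samples):
    """Check the contraction condition at sampled states: the symmetric part
    of the Jacobian of x' = f(x) must be negative definite (identity metric)."""
    for x in samples:
        J = jacobian(x)
        sym = 0.5 * (J + J.T)
        if np.max(np.linalg.eigvalsh(sym)) >= 0.0:
            return False
    return True

# Example: a linear point attractor x' = -K x with K positive definite,
# the kind of primitive that can be safely combined with others.
K = np.array([[2.0, 0.5],
              [0.5, 1.0]])
jac = lambda x: -K
samples = [np.zeros(2), np.ones(2), np.array([-1.0, 2.0])]
```

For a linear system the Jacobian is constant, so one sample suffices; the sampled check matters for the nonlinear primitives the framework targets.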


USC and Max Planck: The Double Life of a Top Robotics Researcher

Der Spiegel International

When Stefan Kai Schaal decided he wanted to earn more money, he took a leave of absence. It took the researcher more than two years to integrate his new German life seamlessly and inconspicuously into his old American life. Schaal's employer, the University of Southern California (USC) in Los Angeles, was accommodating. It granted the renowned computer scientist the sabbatical in the middle of the semester - a sabbatical he had applied for on the day he was thrown out of his home and his wife filed for divorce after nine years of marriage. That was six years ago.


Learning to Select and Generalize Striking Movements in Robot Table Tennis

Muelling, Katharina (Max Planck Institute for Intelligent Systems) | Kober, Jens (Max Planck Institute for Intelligent Systems) | Kroemer, Oliver (Technische Universitaet Darmstadt) | Peters, Jan (Technische Universitaet Darmstadt)

AAAI Conferences

Learning new motor tasks autonomously from interaction with a human being is an important goal for both robotics and machine learning. However, when moving beyond basic skills, most monolithic machine learning approaches fail to scale. In this paper, we take the task of learning table tennis as an example and present a new framework which allows a robot to learn cooperative table tennis from interaction with a human. Therefore, the robot first learns a set of elementary table tennis hitting movements from a human teacher by kinesthetic teach-in, which is compiled into a set of dynamical system motor primitives (DMPs). Subsequently, the system generalizes these movements to a wider range of situations using our mixture of motor primitives (MoMP) approach. The resulting policy enables the robot to select appropriate motor primitives as well as to generalize between them. Finally, the robot plays with a human table tennis partner and learns online to improve its behavior.
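The mixture-of-motor-primitives (MoMP) idea in the abstract can be sketched as a gating network: each primitive proposes an action, and context-dependent gating weights blend and select among them. A minimal sketch (the Gaussian gatings, the ball-height context, and all names are hypothetical stand-ins, not the paper's implementation):

```python
import numpy as np

def momp_action(context, primitives, gatings):
    """Blend primitive outputs with normalized, context-dependent gating
    weights, as in a mixture-of-motor-primitives scheme."""
    w = np.array([g(context) for g in gatings], dtype=float)
    w /= w.sum()
    return w @ np.array([p(context) for p in primitives])

# Hypothetical example: two hitting primitives tuned to low vs. high balls,
# with Gaussian gatings over a scalar "incoming ball height" context.
prim_low  = lambda s: 0.0   # paddle angle learned for low balls
prim_high = lambda s: 1.0   # paddle angle learned for high balls
gate_low  = lambda s: np.exp(-0.5 * ((s - 0.0) / 0.2) ** 2)
gate_high = lambda s: np.exp(-0.5 * ((s - 1.0) / 0.2) ** 2)
```

Near a demonstrated situation one gate dominates (selection); between demonstrations the weights interpolate (generalization), which is the behavior the abstract describes.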


Training Wheels for the Robot: Learning from Demonstration Using Simulation

Koenig, Nathan (Open Source Robotics Foundation) | Matarić, Maja (University of Southern California)

AAAI Conferences

Learning from demonstration (LfD) is a promising technique for instructing/teaching autonomous systems based on demonstrations from people who may have little to no experience with robots. An important aspect to LfD is the communication method used to transfer knowledge from an instructor to a robot. The communication method affects the complexity of the demonstration process for instructors, the range of tasks a robot can learn, and the learning algorithm itself. We have designed a graphical interface and an instructional language to provide an intuitive teaching system. The drawback to simplifying the teaching interface is that the resulting demonstration data are less structured, adding complexity to the learning process. This additional complexity is handled through the combination of a minimal set of predefined behaviors and a task representation capable of learning probabilistic policies over a set of behaviors. The predefined behaviors consist of finite actions a robot can perform, which act as building blocks for more complex tasks.


Reinforcement Learning to Adjust Robot Movements to New Situations

Kober, Jens (Max Planck Institute for Intelligent Systems) | Oztop, Erhan (Advanced Telecommunications Research Institute) | Peters, Jan (Max Planck Institute for Intelligent Systems)

AAAI Conferences

Many complex robot motor skills can be represented using elementary movements, and there exist efficient techniques for learning parametrized motor plans using demonstrations and self-improvement. However, with current techniques the robot often needs to learn a new elementary movement even if a parametrized motor plan exists that covers a related situation. A method is needed that modulates the elementary movement through the meta-parameters of its representation. In this paper, we describe how to learn such mappings from circumstances to meta-parameters using reinforcement learning. In particular, we use a kernelized version of reward-weighted regression. We show two applications of the presented setup in robotic domains: the generalization of throwing movements in darts, and of hitting movements in table tennis. We demonstrate that both tasks can be learned successfully using simulated and real robots.
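The kernelized reward-weighted regression the abstract mentions amounts to a kernel-smoothed, reward-weighted average: past meta-parameters contribute to the prediction for a new situation in proportion to both their reward and their kernel similarity. A minimal sketch assuming a Gaussian kernel (the function name, bandwidth, and data layout are illustrative, not the paper's formulation):

```python
import numpy as np

def kernel_rwr(s_query, S, Theta, R, bw=0.3):
    """Predict a meta-parameter for situation s_query from past situations S
    (n x d), their meta-parameters Theta (n,), and rewards R (n,):
    a reward-weighted Nadaraya-Watson estimate with a Gaussian kernel."""
    d = np.linalg.norm(S - s_query, axis=1)      # distance to each past situation
    w = R * np.exp(-0.5 * (d / bw) ** 2)         # reward x kernel similarity
    return (w @ Theta) / w.sum()

# Toy data: two past situations with different successful meta-parameters.
S = np.array([[0.0], [1.0]])
Theta = np.array([0.0, 10.0])
R = np.ones(2)
```

Queries near a past situation reproduce its meta-parameter; queries in between interpolate, biased toward the higher-reward experiences.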



Nonparametric Model-Based Reinforcement Learning

Atkeson, Christopher G.

Neural Information Processing Systems

This paper describes some of the interactions of model learning algorithms and planning algorithms we have found in exploring model-based reinforcement learning. The paper focuses on how local trajectory optimizers can be used effectively with learned nonparametric models. We find that trajectory planners that are fully consistent with the learned model often have difficulty finding reasonable plans in the early stages of learning. Trajectory planners that balance obeying the learned model with minimizing cost (or maximizing reward) often do better, even if the plan is not fully consistent with the learned model.
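The abstract's key finding is that treating the learned model as a soft penalty, rather than a hard constraint, works better early in learning. One way to sketch that trade-off is an objective that sums task cost and a weighted model-inconsistency term, with the weight `lam` interpolating between pure cost minimization (lam = 0) and strict model consistency (lam large); the function and names below are an illustrative sketch, not the paper's planner:

```python
import numpy as np

def soft_consistency_objective(states, actions, model, cost, lam=1.0):
    """Trajectory objective balancing task cost against agreement with a
    learned dynamics model: J = sum_t cost(x_t, u_t)
                                + lam * ||x_{t+1} - model(x_t, u_t)||^2."""
    J = 0.0
    for t, u in enumerate(actions):
        J += cost(states[t], u)                                  # task cost
        pred = model(states[t], u)                               # model prediction
        J += lam * float(np.sum((states[t + 1] - pred) ** 2))    # soft consistency
    return J

# Toy 1-D setup: quadratic action cost, learned model x' = x + u.
model = lambda x, u: x + u
cost = lambda x, u: float(u @ u)
```

A planner minimizing this over `states` and `actions` jointly can propose trajectories that mildly disagree with an unreliable early model, which is the behavior the abstract reports as advantageous.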


