Calinon, Sylvain
A survey on policy search algorithms for learning robot controllers in a handful of trials
Chatzilygeroudis, Konstantinos, Vassiliades, Vassilis, Stulp, Freek, Calinon, Sylvain, Mouret, Jean-Baptiste
Most policy search algorithms require thousands of training episodes to find an effective policy, which is often infeasible with a physical robot. This survey article focuses on the opposite extreme of the spectrum: how can a robot adapt with only a handful of trials (a dozen) and a few minutes? By analogy with the term "big data", we refer to this challenge as "micro-data reinforcement learning". We show that a first strategy is to leverage prior knowledge on the policy structure (e.g., dynamic movement primitives), on the policy parameters (e.g., demonstrations), or on the dynamics (e.g., simulators). A second strategy is to build data-driven surrogate models of the expected reward (e.g., Bayesian optimization) or of the dynamics (e.g., model-based policy search), so that the policy optimizer queries the model instead of the real system. Overall, successful micro-data algorithms combine these two strategies, varying the kind of model and prior knowledge. The current scientific challenges essentially revolve around scaling up to complex robots (e.g., humanoids), designing generic priors, and reducing computation time.
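The second strategy (a data-driven surrogate of the expected reward) can be sketched as a minimal Bayesian-optimization loop: a Gaussian-process surrogate of the reward is fit to the trials run so far, and an acquisition function picks the next policy to try, so most queries hit the model rather than the robot. The 1-D `reward` function, kernel settings and trial budget below are illustrative stand-ins, not values from the paper.

```python
import numpy as np

def rbf_kernel(A, B, length=0.3):
    """Squared-exponential kernel between two sets of policy parameters."""
    d = A[:, None, :] - B[None, :, :]
    return np.exp(-0.5 * np.sum(d**2, axis=2) / length**2)

def gp_posterior(X, y, Xq, noise=1e-4):
    """GP posterior mean/std of the reward surrogate at query points Xq."""
    K = rbf_kernel(X, X) + noise * np.eye(len(X))
    Ks = rbf_kernel(Xq, X)
    mu = Ks @ np.linalg.solve(K, y)
    v = np.linalg.solve(K, Ks.T)
    var = np.diag(rbf_kernel(Xq, Xq) - Ks @ v)
    return mu, np.sqrt(np.clip(var, 1e-12, None))

def reward(theta):
    """Toy stand-in for one episode on the robot."""
    return float(-(theta[0] - 0.6)**2 + 0.05 * np.cos(8 * theta[0]))

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(3, 1))            # a few initial trials
y = np.array([reward(x) for x in X])
grid = np.linspace(0, 1, 200)[:, None]        # candidate policy parameters

for _ in range(7):                            # a handful of extra trials
    mu, std = gp_posterior(X, y, grid)
    ucb = mu + 2.0 * std                      # upper-confidence-bound acquisition
    x_next = grid[np.argmax(ucb)][None, :]    # most promising untried policy
    X = np.vstack([X, x_next])
    y = np.append(y, reward(x_next[0]))       # one real episode per iteration

best = X[np.argmax(y)][0]
print(f"best policy parameter after {len(y)} episodes: {best:.2f}")
```

Only ten episodes are "run on the robot" here; all other reward evaluations are posterior queries to the surrogate, which is the essence of the micro-data setting.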
A Skill Transfer Approach for Continuum Robots – Imitation of Octopus Reaching Motion with the STIFF-FLOP Robot
Malekzadeh, Milad S. (Istituto Italiano di Tecnologia (IIT)) | Calinon, Sylvain (Idiap Research Institute and Istituto Italiano di Tecnologia (IIT)) | Bruno, Danilo (Istituto Italiano di Tecnologia (IIT)) | Caldwell, Darwin G. (Istituto Italiano di Tecnologia (IIT))
Transferring skills to hyper-redundant systems requires the design of new motion-primitive representations that can cope with multiple sources of noise and redundancy, and that can dynamically handle perturbations in the environment. One route is to take inspiration from invertebrate systems in nature to search for new, versatile representations of motion/behavior primitives for continuum robots. In particular, the remarkably varied skills achieved by the octopus can guide us toward the design of such a robust encoding scheme. This abstract presents our ongoing work, which combines statistical machine learning, dynamical systems and stochastic optimization to study the problem of transferring skills to a flexible surgical robot (STIFF-FLOP) composed of two constant-curvature modules. The approach is tested in simulation by imitation and self-refinement of an octopus reaching motion.
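The kinematics of a two-module constant-curvature robot such as the one described above can be sketched with the standard arc parameterization (curvature, bending-plane angle, arc length), chaining one homogeneous transform per module. The numeric curvatures and lengths below are made-up illustration values, not STIFF-FLOP dimensions.

```python
import numpy as np

def rot_z(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

def rot_y(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

def segment_transform(kappa, phi, length):
    """Homogeneous transform of one constant-curvature segment,
    parameterized by curvature, bending-plane angle and arc length."""
    T = np.eye(4)
    if abs(kappa) < 1e-9:                      # straight segment
        T[2, 3] = length
        return T
    theta = kappa * length                     # total bending angle of the arc
    T[:3, :3] = rot_z(phi) @ rot_y(theta) @ rot_z(-phi)
    T[:3, 3] = [np.cos(phi) * (1 - np.cos(theta)) / kappa,
                np.sin(phi) * (1 - np.cos(theta)) / kappa,
                np.sin(theta) / kappa]
    return T

# Chain the two modules: tip pose = T1 @ T2.
T1 = segment_transform(kappa=4.0, phi=0.0, length=0.25)  # bends ~57 deg in x-z
T2 = segment_transform(kappa=0.0, phi=0.0, length=0.10)  # straight extension
tip = (T1 @ T2)[:3, 3]
print("tip position:", np.round(tip, 3))
```

A learned motion primitive would then output the per-module curvature parameters over time, with this forward model mapping them to task-space poses.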
Learning Collaborative Impedance-Based Robot Behaviors
Rozo, Leonel Dario (Istituto Italiano di Tecnologia) | Calinon, Sylvain (Istituto Italiano di Tecnologia) | Caldwell, Darwin (Istituto Italiano di Tecnologia) | Jimenez, Pablo (Researcher, Institut de Robotica i Informatica Industrial) | Torras, Carme (Institut de Robotica i Informatica Industrial)
Research in learning from demonstration has focused on transferring movements from humans to robots. However, a need is arising for robots that do not just replicate a task on their own, but that also interact with humans in a safe and natural way to accomplish tasks cooperatively. Robots with variable impedance capabilities open the door to new and challenging applications, where the learning algorithms must be extended to encapsulate force and vision information. In this paper we propose a framework for transferring impedance-based behaviors to a torque-controlled robot by kinesthetic teaching. The proposed model encodes the examples as a task-parameterized statistical dynamical system, where the robot impedance is shaped by estimating virtual stiffness matrices from the set of demonstrations. A collaborative assembly task is used as a testbed. The results show that the model can modify the robot impedance during task execution to facilitate the collaboration, triggering stiff and compliant behaviors in an on-line manner to adapt to the user's actions.
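In its simplest form, estimating a virtual stiffness matrix from demonstrations can be sketched as follows: if the robot is modeled as a spring, f = K(x_d - x), then K can be recovered from recorded forces and position errors by least squares. This is a deliberate simplification of the paper's task-parameterized statistical model, and the demonstration data below are synthetic stand-ins.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in for kinesthetic demonstrations: position errors
# (attractor minus actual position) and the forces sensed at those errors.
K_true = np.array([[120.0, 0.0],
                   [0.0,  40.0]])                     # stiff in x, compliant in y
err = rng.normal(0.0, 0.02, size=(200, 2))            # x_d - x  [m]
force = err @ K_true.T + rng.normal(0.0, 0.05, (200, 2))  # f = K(x_d - x) + noise

# Least-squares estimate of the virtual stiffness: solve err @ K.T ~= force.
K_est, *_ = np.linalg.lstsq(err, force, rcond=None)
K_est = K_est.T
print(np.round(K_est, 1))
```

The anisotropy of the recovered matrix (stiff along one axis, compliant along the other) is what lets the controller resist the user in directions that matter for the task while yielding elsewhere.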
Bayesian Nonparametric Multi-Optima Policy Search in Reinforcement Learning
Bruno, Danilo (Istituto Italiano di Tecnologia (IIT)) | Calinon, Sylvain (Istituto Italiano di Tecnologia (IIT)) | Caldwell, Darwin G. (Istituto Italiano di Tecnologia (IIT))
Skills can often be performed in many different ways. In order to provide robots with human-like adaptation capabilities, it is of great interest to learn several ways of achieving the same skill in parallel, since possible changes in the environment or in the robot can make some solutions infeasible. In this case, the knowledge of multiple solutions can avoid relearning the task. This paper addresses this problem within the framework of reinforcement learning, as the automatic determination of multiple optimal parameterized policies. For this purpose, a model handling a variable number of policies is built using a Bayesian non-parametric approach. The algorithm is first compared to single-policy algorithms on known benchmarks, and then applied to a typical robotic problem that admits multiple solutions.
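The key idea of letting the data decide how many distinct policies to keep can be illustrated with DP-means, a small-variance limit of the Dirichlet-process mixture; this is a simpler stand-in for the paper's Bayesian non-parametric model. Clusters of high-reward policy parameters are created on demand rather than being fixed in advance; the two-optima sample data below are hypothetical.

```python
import numpy as np

def dp_means(points, lam, iters=10):
    """DP-means clustering: the number of clusters is not fixed in advance;
    a new cluster is spawned whenever a point lies farther than lam from
    every existing centroid (small-variance limit of a DP mixture)."""
    centroids = [points[0].copy()]
    for _ in range(iters):
        labels = []
        for p in points:
            d = [np.linalg.norm(p - c) for c in centroids]
            if min(d) > lam:                   # no cluster explains p: add one
                centroids.append(p.copy())
                labels.append(len(centroids) - 1)
            else:
                labels.append(int(np.argmin(d)))
        labels = np.array(labels)
        centroids = [points[labels == k].mean(axis=0) if np.any(labels == k)
                     else c for k, c in enumerate(centroids)]
    return np.array(centroids)

# Hypothetical high-reward policy parameters sampled around two distinct
# optima of the same task (e.g., reaching around an obstacle on either side).
rng = np.random.default_rng(2)
samples = np.vstack([rng.normal([-1.0, 0.0], 0.1, (40, 2)),
                     rng.normal([+1.0, 0.0], 0.1, (40, 2))])
optima = dp_means(samples, lam=1.0)
print(len(optima), "distinct policies found")
```

Each resulting centroid can then seed its own parameterized policy, so that if one solution becomes infeasible, the robot can fall back on another without relearning.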