Mobile manipulation robots are envisioned to provide many useful services both in domestic environments and in industrial contexts. In this paper, we present novel approaches that allow mobile manipulation systems to autonomously adapt to new or changing situations. The approaches developed in this paper cover the following four topics: (1) learning the robot's kinematic structure and properties using actuation and visual feedback, (2) learning about articulated objects in the environment in which the robot is operating, (3) using tactile feedback to augment visual perception, and (4) learning novel manipulation tasks from human demonstrations.
Learning contact-rich, robotic manipulation skills is a challenging problem due to the high-dimensionality of the state and action space as well as uncertainty from noisy sensors and inaccurate motor control. To cope with these factors, humans actively exploit contact constraints in the environment; by adopting a similar strategy, robots can also achieve more robust manipulation. In this paper, we enable a robot to autonomously modify its environment and thereby discover how to ease manipulation skill learning. Specifically, we provide the robot with fixtures that it can freely place within the environment. These fixtures provide hard constraints that limit the outcome of robot actions. Thereby, they funnel uncertainty from perception and motor control and scaffold manipulation skill learning. We propose a learning system that consists of two learning loops. In the outer loop, the robot positions the fixture in the workspace. In the inner loop, the robot learns a manipulation skill and, after a fixed number of episodes, returns the reward to the outer loop. The robot is thus incentivised to place the fixture such that the inner loop quickly achieves a high reward. We demonstrate our framework both in simulation and in the real world on three tasks: peg insertion, wrench manipulation and shallow-depth insertion. We show that this form of scaffolding dramatically speeds up manipulation skill learning.
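A minimal sketch of the two-loop structure described above, not the authors' implementation: the outer loop proposes fixture poses (plain random search here), the inner loop runs a fixed budget of skill-learning episodes and reports its reward back. The environment interface (`workspace_bounds`, `place_fixture`) and the `inner_learner` callable are assumed for illustration.

```python
import random

def sample_fixture_pose(bounds):
    # Hypothetical placement strategy: uniform sampling within workspace bounds.
    return tuple(random.uniform(lo, hi) for lo, hi in bounds)

def scaffolded_learning(env, inner_learner, n_outer=20, n_inner_episodes=50):
    """Outer loop places a fixture; inner loop learns a skill under that
    constraint and reports the reward it achieved back to the outer loop."""
    best_pose, best_reward = None, float("-inf")
    for _ in range(n_outer):
        pose = sample_fixture_pose(env.workspace_bounds)         # outer-loop action
        env.place_fixture(pose)                                  # constrain the task
        reward = inner_learner(env, episodes=n_inner_episodes)   # inner loop
        if reward > best_reward:                                  # keep the placement
            best_pose, best_reward = pose, reward                 # that eased learning most
    return best_pose, best_reward
```

The key design point is that the outer loop only sees the scalar reward achieved by the inner loop, so any black-box search (random, Bayesian optimization, etc.) can be used to position the fixture.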
In this work we present a method for learning a cost function for motion planning of human-robot collaborative manipulation tasks in which the human and the robot manipulate objects simultaneously in close proximity. Our approach is based on inverse optimal control, which, given a set of demonstrations, recovers a cost function that balances different features. The cost function recovered from the human demonstrations is composed of elementary features designed to encode notions such as safety, legibility and efficiency of the manipulation motions. We demonstrate the approach on data gathered from motion capture of two humans manipulating blocks on a table in close proximity. To demonstrate the feasibility and efficacy of our approach, we provide initial test results consisting of learning a cost function and then planning for the human kinematic model used in the learning phase.
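As a rough illustration of inverse optimal control with a weighted-feature cost, the sketch below uses a simple max-margin-style update (not necessarily the authors' formulation): weights are adjusted until demonstrated motions score lower, i.e. cheaper, than sampled alternative motions. The feature matrices and the margin value are assumptions.

```python
import numpy as np

def learn_cost_weights(demo_features, alt_features, lr=0.01, iters=200):
    """Max-margin-style IOC sketch: find non-negative weights w such that
    demonstrations are cheaper than alternatives under c(traj) = w . phi(traj).
    demo_features: (n_demos, dim), alt_features: (n_alts, dim)."""
    dim = demo_features.shape[1]
    w = np.ones(dim) / dim
    for _ in range(iters):
        for phi_demo in demo_features:
            # The currently cheapest alternative competes with the demo.
            phi_alt = alt_features[np.argmin(alt_features @ w)]
            # If the demo is not cheaper by a margin of 1, adjust the weights.
            if phi_demo @ w + 1.0 > phi_alt @ w:
                w -= lr * (phi_demo - phi_alt)
        w = np.clip(w, 0.0, None)   # keep a valid (non-negative) cost
        if w.sum() > 0:
            w /= w.sum()            # normalize weights for comparability
    return w

# Features could encode e.g. safety (distance to the human), legibility,
# and efficiency of the motion, as in the abstract above.
```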
We use reinforcement learning (RL) to learn dexterous in-hand manipulation policies which can perform vision-based object reorientation on a physical Shadow Dexterous Hand. The training is performed in a simulated environment in which we randomize many of the physical properties of the system, such as friction coefficients and the object's appearance. Our policies transfer to the physical robot despite being trained entirely in simulation. Our method does not rely on any human demonstrations, but many behaviors found in human manipulation emerge naturally, including finger gaiting, multi-finger coordination, and the controlled use of gravity. Our results were obtained using the same distributed RL system that was used to train OpenAI Five. We also include a video of our results: https://youtu.be/jwSbzNHGflM
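The sim-to-real transfer described here relies on domain randomization: resampling simulator parameters every episode so the policy never overfits to one specific simulated world. Below is a minimal sketch of that idea; the simulator setter methods and the value ranges are illustrative assumptions, not the paper's actual configuration.

```python
import random

def randomize_physics(sim):
    """Domain-randomization sketch (assumed simulator interface): perturb
    physical and visual parameters so a policy trained only in simulation
    has a chance of transferring to the real hand."""
    sim.set_friction(random.uniform(0.5, 1.5))          # friction coefficients
    sim.set_object_mass(random.uniform(0.03, 0.3))      # object mass in kg
    sim.set_object_color(tuple(random.random() for _ in range(3)))  # appearance
    sim.set_actuator_gain(random.uniform(0.8, 1.2))     # motor response

def train(policy, sim, episodes):
    for _ in range(episodes):
        randomize_physics(sim)          # fresh randomization each episode
        rollout = sim.run_episode(policy)
        policy.update(rollout)          # RL update; the paper uses the same
                                        # distributed system as OpenAI Five
```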
Tool manipulation is vital for enabling robots to complete challenging task goals. It requires reasoning about the desired effect of the task and, accordingly, grasping and manipulating the tool in a way that achieves that effect. Task-agnostic grasping optimizes for grasp robustness while ignoring crucial task-specific constraints. In this paper, we propose the Task-Oriented Grasping Network (TOG-Net) to jointly optimize both task-oriented grasping of a tool and the manipulation policy for that tool. The training process of the model is based on large-scale simulated self-supervision with procedurally generated tool objects. We perform both simulated and real-world experiments on two tool-based manipulation tasks: sweeping and hammering. Our model achieves an overall task success rate of 71.1% for sweeping and 80.0% for hammering. Supplementary material is available at: bit.ly/task-oriented-grasp
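To make the contrast with task-agnostic grasping concrete, the sketch below scores grasp candidates by a blend of grasp robustness and predicted task success instead of robustness alone. It only illustrates selection time; TOG-Net jointly trains such predictors from simulated self-supervision, and the predictor callables and blending weight here are hypothetical.

```python
import numpy as np

def select_task_oriented_grasp(grasp_candidates, grasp_quality, task_success,
                               alpha=0.5):
    """Score each candidate grasp by how stable it is AND how likely it is
    to let the downstream manipulation (e.g. sweeping, hammering) succeed."""
    q = np.array([grasp_quality(g) for g in grasp_candidates])   # robustness
    s = np.array([task_success(g) for g in grasp_candidates])    # task utility
    scores = alpha * q + (1.0 - alpha) * s
    return grasp_candidates[int(np.argmax(scores))]
```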