Hardware Conditioned Policies for Multi-Robot Transfer Learning

Chen, Tao, Murali, Adithyavairavan, Gupta, Abhinav

Feb-14-2020, 20:42:25 GMT–Neural Information Processing Systems

Deep reinforcement learning can be used to learn dexterous robotic policies, but it is challenging to transfer them to new robots with vastly different hardware properties. It is also prohibitively expensive to learn a new policy from scratch for each robot's hardware, owing to the high sample complexity of modern state-of-the-art algorithms. We propose a novel approach called Hardware Conditioned Policies, in which we train a universal policy conditioned on a vector representation of robot hardware. We consider robots in simulation with varied dynamics, kinematic structure, kinematic lengths, and degrees of freedom. First, we use the kinematic structure directly as the hardware encoding and show strong zero-shot transfer to novel robots not seen during training.
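The idea of conditioning one policy on a hardware vector can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `HardwareConditionedPolicy`, the layer sizes, the fixed-length hardware vector, and the example link lengths are all assumptions made for illustration, and the weights are random rather than trained. The sketch only shows the input structure, where the robot's state is concatenated with a vector describing its hardware before being passed to a shared network.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """Build a random (n_out x n_in) weight matrix and a zero bias (untrained)."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def apply_layer(layer, x, activation=None):
    """Compute activation(W x + b) for a single input vector x."""
    weights, biases = layer
    out = [sum(w * xi for w, xi in zip(row, x)) + b
           for row, b in zip(weights, biases)]
    return [activation(v) for v in out] if activation else out


class HardwareConditionedPolicy:
    """Hypothetical sketch: one network shared by all robots.

    The network input is the robot state concatenated with a hardware
    vector. A fixed hardware vector length is an assumption of this sketch.
    """

    def __init__(self, state_dim, hardware_dim, action_dim,
                 hidden_dim=32, seed=0):
        rng = random.Random(seed)
        self.state_dim = state_dim
        self.hardware_dim = hardware_dim
        self.hidden = make_layer(state_dim + hardware_dim, hidden_dim, rng)
        self.out = make_layer(hidden_dim, action_dim, rng)

    def act(self, state, hardware_vec):
        if len(state) != self.state_dim:
            raise ValueError("unexpected state length")
        if len(hardware_vec) != self.hardware_dim:
            raise ValueError("unexpected hardware vector length")
        x = list(state) + list(hardware_vec)  # condition on the hardware
        h = apply_layer(self.hidden, x, math.tanh)
        return apply_layer(self.out, h, math.tanh)  # actions in [-1, 1]


# Two hypothetical robots described only by made-up link lengths (metres).
policy = HardwareConditionedPolicy(state_dim=4, hardware_dim=3, action_dim=2)
state = [0.1, -0.2, 0.05, 0.3]
robot_a = [0.30, 0.30, 0.20]
robot_b = [0.50, 0.20, 0.40]

action_a = policy.act(state, robot_a)
action_b = policy.act(state, robot_b)
print(action_a)
print(action_b)
```

Because the hardware vector is part of the input, the same weights can produce different actions for different robots in the same state, which is what would make transfer to an unseen hardware vector possible once the network is trained.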

hardware conditioned policy, multi-robot transfer learning, robot hardware

Genre:
- Research Report (0.43)

Technology:
- Information Technology > Artificial Intelligence
  - Robots (1.00)
  - Machine Learning
    - Reinforcement Learning (0.63)
    - Transfer Learning (0.40)