Meta-Evolve: Continuous Robot Evolution for One-to-many Policy Transfer

May-6-2024–arXiv.org Artificial Intelligence

With existing one-to-one policy transfer methods, transferring a policy from the source robot to multiple target robots requires launching a separate, independent run for each target robot.

We investigate the problem of transferring an expert policy from a source robot to multiple different robots. To solve this problem, we propose a method named Meta-Evolve that uses continuous robot evolution to efficiently transfer the policy to each target robot through a set of tree-structured evolutionary robot sequences. The robot evolution tree allows the robot evolution paths to be shared, so our approach can significantly outperform naive one-to-one policy transfer. We present a heuristic approach to determine an optimized robot evolution tree. Experiments show that our method improves the efficiency of one-to-three transfer of manipulation policy by up to 3.2× and one-to-six transfer of agile locomotion policy by 2.4× in terms of simulation cost over the baseline of launching multiple independent one-to-one policy transfers.

The robotics industry has designed and developed a large number of commercial robots deployed in various applications. How can robotic skills be learned efficiently on diverse robots in a scalable fashion? A popular solution is to train a policy for every new robot on every new task from scratch. This is not only sample-inefficient but also impractical for complex robots because of their large exploration space. Inter-robot imitation by statistic matching methods that optimize to match the distribution of actions (Ross et al., 2011), transitioned states (Liu et al., 2019; Radosavovic et al., 2020), or reward (Ng et al., 2000; Ho & Ermon, 2016) could be a possible solution. However, these methods yield optimal performance only when applied to robots with similar dynamics. Recent advances in evolution-based imitation learning (Liu et al., 2022a;b) inspire us to view this problem from the perspective of transferring a policy from one robot to another. The core idea is to interpolate between two different robots by producing a large number of intermediate robots that gradually evolve from the source robot toward the target robot.
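The interpolation idea and the benefit of sharing evolution paths can be illustrated with a minimal sketch. It is not the Meta-Evolve algorithm: it assumes each robot can be described by a small vector of physical parameters, uses plain linear interpolation to generate intermediate robots, and treats Euclidean path length in parameter space as a rough stand-in for simulation cost. The function names, the parameter values, and the hand-picked branch point are all hypothetical.

```python
import numpy as np


def interpolate_robots(source, target, num_steps):
    """Return num_steps + 1 parameter vectors going from source to target.

    Assumption: a robot is fully described by a vector of physical
    parameters, and intermediate robots are linear blends of the two ends.
    """
    source = np.asarray(source, dtype=float)
    target = np.asarray(target, dtype=float)
    alphas = np.linspace(0.0, 1.0, num_steps + 1)
    return [(1.0 - a) * source + a * target for a in alphas]


def path_length(a, b):
    """Euclidean distance between two robots in parameter space.

    Used here only as a hypothetical proxy for the cost of evolving a -> b.
    """
    return float(np.linalg.norm(np.asarray(b, dtype=float) - np.asarray(a, dtype=float)))


# Hypothetical robots, each described by two physical parameters.
source = [0.0, 0.0]
targets = [[4.0, 1.0], [4.0, -1.0]]
branch = [3.0, 0.0]  # hand-picked shared intermediate robot (not optimized)

# Intermediate robots on a single source-to-target evolution path.
intermediate = interpolate_robots(source, targets[0], num_steps=4)

# Baseline: one independent evolution path per target robot.
independent = sum(path_length(source, t) for t in targets)

# Tree: evolve once from the source to the branch, then split toward each target.
shared = path_length(source, branch) + sum(path_length(branch, t) for t in targets)

print(f"independent paths: {independent:.3f}")
print(f"shared tree:       {shared:.3f}")
```

In this toy setting the shared tree is shorter than the two independent paths because the common segment from the source to the branch point is traversed only once. How the actual branch points are chosen is the job of the heuristic mentioned in the abstract, which this sketch does not attempt to reproduce.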

policy transfer, robot, target robot

Country:
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)

Genre:
- Research Report (0.50)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning
    - Neural Networks > Deep Learning (0.67)
    - Reinforcement Learning (0.69)
  - Robots (1.00)
