SPRINT: Scalable Policy Pre-Training via Language Instruction Relabeling
Jesse Zhang, Karl Pertsch, Jiahui Zhang, Joseph J. Lim
Pre-training robot policies with a rich set of skills can substantially accelerate the learning of downstream tasks. Prior works have defined pre-training tasks via natural language instructions, but doing so requires tedious human annotation of hundreds of thousands of instructions. Thus, we propose SPRINT, a scalable offline policy pre-training approach which substantially reduces the human effort needed for pre-training a diverse set of skills. Our method uses two core ideas to automatically expand a base set of pre-training tasks: instruction relabeling via large language models and cross-trajectory skill chaining through offline reinforcement learning. As a result, SPRINT pre-training equips robots with a much richer repertoire of skills. Experimental results in a household simulator and on a real robot kitchen manipulation task show that SPRINT leads to substantially faster learning of new long-horizon tasks than previous pre-training approaches. Website at https://clvrai.com/sprint.
arXiv.org Artificial Intelligence
Jan-29-2024
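
The abstract's first core idea, instruction relabeling via large language models, can be illustrated with a minimal sketch. The names below (`Segment`, `relabel_with_llm`, the `summarize` callable standing in for an LLM query) are illustrative assumptions, not the paper's actual API; the idea shown is simply that consecutive sub-task instructions from an annotated trajectory are merged by an LLM into new, higher-level pre-training tasks:

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

@dataclass
class Segment:
    """A trajectory segment paired with its human-written language annotation."""
    states: list       # sequence of observations (opaque here)
    actions: list      # sequence of actions (opaque here)
    instruction: str   # sub-task instruction, e.g. "pick up the mug"

def relabel_with_llm(
    segments: List[Segment],
    summarize: Callable[[List[str]], str],
) -> List[Tuple[List[Segment], str]]:
    """Expand a base set of pre-training tasks by asking an LLM to merge
    consecutive sub-task instructions into one higher-level instruction.

    `summarize` is a placeholder for an LLM call, e.g. prompting with
    "Summarize these steps as a single task: ...".
    """
    tasks = []
    # Every contiguous span of >= 2 annotated sub-tasks becomes a
    # candidate higher-level task with an LLM-generated instruction.
    for start in range(len(segments)):
        for end in range(start + 2, len(segments) + 1):
            span = segments[start:end]
            merged = summarize([seg.instruction for seg in span])
            tasks.append((span, merged))
    return tasks

if __name__ == "__main__":
    demo = [
        Segment([], [], "pick up the mug"),
        Segment([], [], "put the mug in the sink"),
        Segment([], [], "turn on the faucet"),
    ]
    # Trivial stand-in for the LLM; a real system would call a language model.
    stub = lambda instrs: "; then ".join(instrs)
    for span, instr in relabel_with_llm(demo, stub):
        print(len(span), "segments ->", instr)
```

In the full method, such relabeled trajectory-instruction pairs would feed an offline RL objective; the second component, cross-trajectory skill chaining, is not shown in this sketch.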