Accelerating Reinforcement Learning of Robotic Manipulations via Feedback from Large Language Models

Kun Chu, Xufeng Zhao, Cornelius Weber, Mengdi Li, Stefan Wermter

arXiv.org Artificial Intelligence 

Reinforcement Learning (RL) has shown its power in solving sequential decision-making problems in the robotic domain [1, 2] by optimizing control policies directly through trial-and-error interactions with the environment. However, several challenges remain [3], such as sample inefficiency and the difficulty of specifying rewards, which limit its application in the field. Inspired by how humans learn skills from more knowledgeable people such as teachers or supervisors, a potential solution to these limitations is to learn from human expert guidance, thereby injecting additional information into the learning process. Human guidance has been shown to accelerate the learning of new tasks by providing additional rewards or advice, whether through demonstrations [4, 5] or feedback [6, 7, 8, 9]. However, collecting sufficient human guidance is time-consuming and costly.

Recently, Large Language Models (LLMs) have shown remarkable abilities to generate human-like responses in the textual domain [10, 11], and their applications have been explored in the robotic domain. While some approaches prompt LLMs to instruct robots in performing tasks [12, 13, 14, 15], they focus on utilizing the LLMs' common-sense knowledge to give high-level advice for employing pre-trained or hard-coded low-level control policies, which require extensive data collection or expert knowledge, respectively. Since these works do not perform policy learning while executing tasks with LLMs, the robots' performance depends heavily on the LLM's capabilities and its consistent availability each time a task is executed.
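To make the idea of injecting evaluative feedback into policy learning concrete, the following minimal sketch shows a toy Q-learning loop in which an external feedback signal is scaled and added to a sparse environment reward. The feedback function here is a hand-coded stub standing in for an LLM's evaluative response to a textual description of a transition; all names and parameters (llm_feedback, beta, the 1-D grid-world setup) are illustrative assumptions and not the authors' implementation.

# Minimal sketch (illustrative, not the paper's method): Q-learning on a toy
# 1-D grid world where an external "feedback" signal -- a stub standing in
# for an LLM's evaluative response -- is added to the environment reward.
# All names (llm_feedback, GOAL, beta, ...) are hypothetical.

import random

N_STATES, GOAL = 10, 9          # states 0..9, goal at the right end
ACTIONS = [-1, +1]              # move left / move right


def env_step(state, action):
    """Environment transition with a sparse reward only at the goal."""
    next_state = max(0, min(N_STATES - 1, state + action))
    reward = 1.0 if next_state == GOAL else 0.0
    done = next_state == GOAL
    return next_state, reward, done


def llm_feedback(state, action):
    """Stub for evaluative feedback an LLM could give from a textual
    description of the transition; here: +1 if the action moves toward
    the goal, -1 otherwise (a hand-coded stand-in)."""
    return 1.0 if action > 0 else -1.0


def train(episodes=200, alpha=0.1, gamma=0.95, eps=0.1, beta=0.1):
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy action selection over the Q-table.
            a_idx = (random.randrange(2) if random.random() < eps
                     else max(range(2), key=lambda i: q[state][i]))
            action = ACTIONS[a_idx]
            next_state, r_env, done = env_step(state, action)
            # Shaped reward: environment reward plus scaled feedback signal.
            r = r_env + beta * llm_feedback(state, action)
            target = r + (0.0 if done else gamma * max(q[next_state]))
            q[state][a_idx] += alpha * (target - q[state][a_idx])
            state = next_state
    return q


if __name__ == "__main__":
    q_table = train()
    greedy = [ACTIONS[max(range(2), key=lambda i: q_table[s][i])]
              for s in range(N_STATES)]
    print("Greedy action per state:", greedy)

In this sketch the dense feedback term compensates for the sparse environment reward, which is one simple way feedback can speed up learning; an actual LLM-based variant would replace llm_feedback with a call that describes the transition in text and parses the model's judgment.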
