Compositional Instruction Following with Language Models and Reinforcement Learning

Cohen, Vanya, Tasse, Geraud Nangue, Gopalan, Nakul, James, Steven, Gombolay, Matthew, Mooney, Ray, Rosman, Benjamin

Jan-21-2025–arXiv.org Artificial Intelligence

Combining reinforcement learning with language grounding is challenging as the agent needs to explore the environment while simultaneously learning multiple language-conditioned tasks. To address this, we introduce a novel method: the compositionally-enabled reinforcement learning language agent (CERLLA). Our method reduces the sample complexity of tasks specified with language by leveraging compositional policy representations and a semantic parser trained using reinforcement learning and in-context learning. We evaluate our approach in an environment requiring function approximation and demonstrate compositional generalization to novel tasks. Our method significantly outperforms the previous best non-compositional baseline in terms of sample complexity on 162 tasks designed to test compositional generalization. Our model attains a higher success rate and learns in fewer steps than the non-compositional baseline. It reaches a success rate equal to an oracle policy's upper-bound performance of 92%. With the same number of environment steps, the baseline only reaches a success rate of 80%.

machine learning, natural language, reinforcement learning, (16 more...)

arXiv.org Artificial Intelligence

Jan-21-2025

arXiv.org PDF

Add feedback

Country:
- North America > United States
  - Arizona (0.04)
  - Texas > Travis County
    - Austin (0.04)
  - Minnesota > Hennepin County
    - Minneapolis (0.14)
- Asia > Middle East
  - Jordan (0.04)

Genre:
- Research Report
  - New Finding (0.46)
  - Promising Solution (0.34)

Industry:
- Government > Military (0.68)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language (1.00)
  - Machine Learning > Reinforcement Learning (1.00)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found