Policy Learning with a Language Bottleneck
Megha Srivastava, Cédric Colas, Dorsa Sadigh, Jacob Andreas
arXiv.org Artificial Intelligence
Modern AI systems such as self-driving cars and game-playing agents achieve superhuman performance, but often lack human-like features such as generalization, interpretability, and human interoperability. Inspired by the rich interactions between language and decision-making in humans, we introduce Policy Learning with a Language Bottleneck (PLLB), a framework enabling AI agents to generate linguistic rules that capture the strategies underlying their most rewarding behaviors. PLLB alternates between a rule-generation step guided by language models and an update step where agents learn new policies guided by the rules. In a two-player communication game, a maze-solving task, and two image reconstruction tasks, we show that PLLB agents not only learn more interpretable and generalizable behaviors, but can also share the learned rules with human users, enabling more effective human-AI coordination.
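The alternation the abstract describes can be sketched as a simple loop. The following is a minimal, hypothetical illustration, not the paper's implementation: the environment is a toy, and `generate_rule` is a stub standing in for the language-model-guided rule-generation step.

```python
import random

def rollout(policy, env_step, horizon=5):
    """Run one episode; return (trajectory, total_reward)."""
    state, traj, total = 0, [], 0.0
    for _ in range(horizon):
        action = policy(state)
        state, reward = env_step(state, action)
        traj.append((state, action, reward))
        total += reward
    return traj, total

def generate_rule(best_trajectories):
    """Stub for the LM-guided rule-generation step: summarize the
    most rewarding behavior as a linguistic rule."""
    # Illustrative heuristic: name the most common action in the best episodes.
    actions = [a for traj in best_trajectories for (_, a, _) in traj]
    favored = max(set(actions), key=actions.count)
    return f"prefer action {favored}"

def rule_guided_policy(rule):
    """Stub for the update step: derive a new policy guided by the rule."""
    favored = int(rule.split()[-1])
    return lambda state: favored

def pllb(env_step, n_iters=3, n_episodes=8):
    """Alternate between rule generation and rule-guided policy updates."""
    policy = lambda state: random.choice([0, 1])
    rule = None
    for _ in range(n_iters):
        episodes = [rollout(policy, env_step) for _ in range(n_episodes)]
        episodes.sort(key=lambda e: e[1], reverse=True)
        best = [traj for traj, _ in episodes[: n_episodes // 2]]
        rule = generate_rule(best)          # rule-generation step
        policy = rule_guided_policy(rule)   # update step
    return policy, rule

# Toy environment: action 1 always yields reward 1, action 0 yields 0.
env = lambda state, action: (state + 1, float(action == 1))
policy, rule = pllb(env)
print(rule)
```

In the paper the rule is natural language produced by a language model from contrasting high- and low-reward behavior, and the update step trains a policy regularized toward rule-consistent actions; the stubs above only mirror that structure.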
May 7, 2024