AMAGO-2: Breaking the Multi-Task Barrier in Meta-Reinforcement Learning with Transformers
Jake Grigsby, Justin Sasek, Samyak Parajuli, Daniel Adebi, Amy Zhang, Yuke Zhu
arXiv.org Artificial Intelligence
Language models trained on diverse datasets unlock generalization by in-context learning. Reinforcement Learning (RL) policies can achieve a similar effect by meta-learning within the memory of a sequence model. However, meta-RL research primarily focuses on adapting to minor variations of a single task. It is difficult to scale towards more general behavior without confronting challenges in multi-task optimization, and few solutions are compatible with meta-RL's goal of learning from large training sets of unlabeled tasks. To address this challenge, we revisit the idea that multi-task RL is bottlenecked by imbalanced training losses created by uneven return scales across different tasks. We build upon recent advancements in Transformer-based (in-context) meta-RL and evaluate a simple yet scalable solution where both an agent's actor and critic objectives are converted to classification terms that decouple optimization from the current scale of returns. Large-scale comparisons in Meta-World ML45, Multi-Game Procgen, Multi-Task POPGym, Multi-Game Atari, and BabyAI find that this design unlocks significant progress in online multi-task adaptation and memory problems without explicit task labels.
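The abstract's central mechanism is replacing scale-sensitive regression objectives with classification over a fixed set of value bins, so that gradient magnitudes no longer track each task's raw return scale. A common way to do this (used here as an illustrative sketch, not necessarily the paper's exact formulation) is a "two-hot" encoding: each scalar return is split between its two neighboring bin centers and trained against with cross-entropy. Function names and the bin layout below are hypothetical:

```python
import numpy as np

def two_hot(returns, bins):
    """Encode scalar returns as 'two-hot' distributions over fixed bin centers.

    Each return's probability mass is split between its two neighboring bins,
    so the expected bin value reconstructs the original scalar (within range).
    """
    returns = np.clip(returns, bins[0], bins[-1])
    idx = np.clip(np.searchsorted(bins, returns, side="right") - 1, 0, len(bins) - 2)
    lo, hi = bins[idx], bins[idx + 1]
    w_hi = (returns - lo) / (hi - lo)  # fraction of mass on the upper bin
    target = np.zeros((len(returns), len(bins)))
    rows = np.arange(len(returns))
    target[rows, idx] = 1.0 - w_hi
    target[rows, idx + 1] = w_hi
    return target

def classification_critic_loss(logits, returns, bins):
    """Cross-entropy between predicted bin logits and two-hot return targets.

    Unlike mean-squared TD error, the loss magnitude is decoupled from the raw
    scale of returns, which is the property that balances multi-task training.
    """
    target = two_hot(returns, bins)
    log_probs = logits - np.log(np.sum(np.exp(logits), axis=-1, keepdims=True))
    return -np.mean(np.sum(target * log_probs, axis=-1))
```

With this construction, tasks whose returns differ by orders of magnitude produce cross-entropy losses of comparable size, since every task is mapped onto the same categorical support before the loss is computed.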
Nov-17-2024