REGENT: A Retrieval-Augmented Generalist Agent That Can Act In-Context in New Environments
Kaustubh Sridhar, Souradeep Dutta, Dinesh Jayaraman, Insup Lee
arXiv.org Artificial Intelligence
Building generalist agents that can rapidly adapt to new environments is a key challenge for deploying AI in the digital and real worlds. Is scaling current agent architectures the most effective way to build generalist agents? We propose a novel approach to pre-train relatively small policies on relatively small datasets and adapt them to unseen environments via in-context learning, without any finetuning. Our key idea is that retrieval offers a powerful bias for fast adaptation. Indeed, we demonstrate that even a simple retrieval-based 1-nearest neighbor agent offers a surprisingly strong baseline for today's state-of-the-art generalist agents. From this starting point, we construct a semi-parametric agent, REGENT, that trains a transformer-based policy on sequences of queries and retrieved neighbors. REGENT can generalize to unseen robotics and game-playing environments via retrieval augmentation and in-context learning, achieving this with up to 3x fewer parameters and up to an order-of-magnitude fewer pre-training datapoints, significantly outperforming today's state-of-the-art generalist agents.

AI agents, both in the digital [38, 19, 37, 28, 53] and real world [5, 7, 63, 33, 48, 24], constantly face changing environments that require rapid or even instantaneous adaptation. True generalist agents must not only perform well across large numbers of training environments but, arguably more importantly, adapt rapidly to new ones. While this goal has been of considerable interest to the reinforcement learning research community, it has proven elusive. The most promising results so far have come from large policies [38, 19, 37, 28, 5], pre-trained on large datasets across many environments, and even these models still struggle to generalize to unseen environments without many new environment-specific demonstrations.
In this work, we take a different approach to the problem of constructing such generalist agents. We start by asking: Is scaling current agent architectures the most effective way to build generalist agents? Observing that retrieval offers a powerful bias for fast adaptation, we first evaluate a simple 1-nearest neighbor method: "Retrieve and Play (R&P)". To determine the action at the current state, R&P simply retrieves the closest state from a few demonstrations in the target environment and plays its corresponding action. Tested on a wide range of environments, both robotics and game-playing, R&P performs on par with or better than state-of-the-art generalist agents.
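The R&P procedure described above can be sketched in a few lines. This is an illustrative reconstruction, not the paper's exact implementation: the L2 distance on flat state vectors and the toy demonstration data are assumptions made for the example.

```python
import numpy as np

def retrieve_and_play(query_state, demo_states, demo_actions):
    """1-nearest-neighbor 'Retrieve and Play' sketch: return the action
    paired with the demonstration state closest to the query state.
    The distance metric (L2 on raw state vectors) is an assumption;
    the paper may use a different state encoding or metric."""
    # Distance from the query state to every demonstration state
    dists = np.linalg.norm(demo_states - query_state, axis=1)
    # Play the action recorded at the nearest demonstration state
    return demo_actions[int(np.argmin(dists))]

# Toy example: three demonstration states with their expert actions
demo_states = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]])
demo_actions = np.array([0, 1, 2])
print(retrieve_and_play(np.array([0.9, 1.1]), demo_states, demo_actions))  # → 1
```

At each environment step the agent re-runs this lookup against the few target-environment demonstrations, so no gradient updates or finetuning are needed.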
Dec-5-2024