Hyper-Decision Transformer for Efficient Online Policy Adaptation

Xu, Mengdi, Lu, Yuchen, Shen, Yikang, Zhang, Shun, Zhao, Ding, Gan, Chuang

Apr-17-2023–arXiv.org Artificial Intelligence

Decision Transformers (DT) have demonstrated strong performances in offline reinforcement learning settings, but quickly adapting to unseen novel tasks remains challenging. To address this challenge, we propose a new framework, called Hyper-Decision Transformer (HDT), that can generalize to novel tasks from a handful of demonstrations in a data-and parameter-efficient manner. To achieve such a goal, we propose to augment the base DT with an adaptation module, whose parameters are initialized by a hyper-network. When encountering unseen tasks, the hyper-network takes a handful of demonstrations as inputs and initializes the adaptation module accordingly. This initialization enables HDT to efficiently adapt to novel tasks by only fine-tuning the adaptation module. We validate HDT's generalization capability on object manipulation tasks. We find that with a single expert demonstration and fine-tuning only 0.5% of DT parameters, HDT adapts faster to unseen tasks than fine-tuning the whole DT model. Finally, we explore a more challenging setting where expert actions are not available, and we show that HDT outperforms state-of-the-art baselines in terms of task success rates by a large margin. Demos are available on our project page. Building an autonomous agent capable of generalizing to novel tasks has been a longstanding goal of artificial intelligence. Recently, large transformer models have shown strong generalization capability on language understanding when fine-tuned with limited data (Brown et al., 2020; Wei et al., 2021). Such success motivates researchers to apply transformer models to the regime of offline reinforcement learning (RL) (Chen et al., 2021; Janner et al., 2021).

large language model, machine learning, task index, (20 more...)

arXiv.org Artificial Intelligence

Apr-17-2023

arXiv.org PDF

Add feedback

Country:
- North America
  - United States (0.14)
  - Canada > Quebec
    - Montreal (0.04)
- Europe > Romania
  - Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)

Genre:
- Research Report (0.82)

Industry:
- Education (0.46)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (0.68)
  - Machine Learning
    - Reinforcement Learning (1.00)
    - Neural Networks (0.68)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found