Pre-Trained Multi-Goal Transformers with Prompt Optimization for Efficient Online Adaptation

Haoqi Yuan, Zongqing Lu

May-29-2025, 18:59:21 GMT–Neural Information Processing Systems

Efficiently solving unseen tasks remains a challenge in reinforcement learning (RL), especially for long-horizon tasks composed of multiple subtasks. Pre-training policies from task-agnostic datasets has emerged as a promising approach, yet existing methods still require substantial RL interaction to learn new tasks. We introduce MGPO, a method that uses Transformer-based policies to model sequences of goals, enabling efficient online adaptation through prompt optimization.

large language model, machine learning, reinforcement learning

Genre:
- Research Report
  - Experimental Study (0.93)
  - New Finding (1.00)

Industry:
- Education (0.46)
- Information Technology (0.46)
- Leisure & Entertainment (0.46)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning
    - Learning Graphical Models > Undirected Networks
      - Markov Models (0.46)
    - Neural Networks > Deep Learning (0.88)
    - Reinforcement Learning (0.89)
  - Natural Language > Large Language Model (1.00)
  - Representation & Reasoning > Optimization (1.00)