Pre-Trained Multi-Goal Transformers with Prompt Optimization for Efficient Online Adaptation
Haoqi Yuan, Zongqing Lu
Neural Information Processing Systems
Efficiently solving unseen tasks remains a challenge in reinforcement learning (RL), especially for long-horizon tasks composed of multiple subtasks. Pre-training policies from task-agnostic datasets has emerged as a promising approach, yet existing methods still necessitate substantial interactions via RL to learn new tasks. We introduce MGPO, a method that leverages the power of Transformer-based policies to model sequences of goals, enabling efficient online adaptation through prompt optimization.
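The core idea in the abstract — keep the pre-trained multi-goal policy frozen and adapt to a new task by optimizing only the sequence of goals in its prompt — can be illustrated with a toy sketch. Everything below is hypothetical (the goal vocabulary, the stand-in return function, and the exhaustive search are assumptions for illustration), not the paper's actual policy or optimizer:

```python
import itertools

# Toy setup (assumptions): goals are integers, and the unseen task is a
# hidden subtask sequence. The pre-trained Transformer policy is frozen;
# online adaptation only searches over the goal-sequence prompt.
GOALS = range(5)        # hypothetical goal vocabulary
TASK = (2, 0, 3)        # hidden subtask sequence defining the unseen task

def rollout_return(prompt):
    """Stand-in for an episode return when the frozen policy is
    conditioned on `prompt`: counts leading subtasks matched in order."""
    score = 0
    for goal, subtask in zip(prompt, TASK):
        if goal != subtask:
            break
        score += 1
    return score

def optimize_prompt(horizon=3):
    """Black-box prompt optimization: pick the goal sequence with the
    highest return (exhaustive search here; the paper's optimizer is
    more sophisticated)."""
    return max(itertools.product(GOALS, repeat=horizon), key=rollout_return)

best_prompt = optimize_prompt()  # recovers the hidden subtask sequence
```

The point of the sketch is that adaptation needs no gradient updates to the policy: only the prompt (a sequence of goals) is searched, using episode returns as the signal.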