Pre-Trained Multi-Goal Transformers with Prompt Optimization for Efficient Online Adaptation

Neural Information Processing Systems 

We adopt a multi-armed bandit framework for this process, enhancing prompt selection based on the returns from online trajectories.
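The bandit view of prompt selection can be illustrated with a minimal sketch. The text above does not say which bandit algorithm is used, so the example assumes UCB1 as a stand-in. Each candidate prompt is treated as one arm, and the return of one online trajectory rolled out under that prompt is the reward. The names `ucb_prompt_selection` and `rollout_return` and the `exploration` constant are hypothetical and chosen only for illustration.

```python
import math
from typing import Callable, List, Sequence, Tuple


def ucb_prompt_selection(
    prompts: Sequence[str],
    rollout_return: Callable[[str], float],
    num_rounds: int,
    exploration: float = 2.0,
) -> Tuple[List[float], List[int]]:
    """Select prompts with UCB1 (an assumed algorithm, not confirmed by the source).

    `rollout_return` is a hypothetical callback that runs one online
    trajectory conditioned on the given prompt and returns its return.
    Returns the running mean return and the pull count for each prompt.
    """
    n = len(prompts)
    counts = [0] * n    # number of online trajectories collected per prompt
    means = [0.0] * n   # running mean of observed returns per prompt

    for t in range(1, num_rounds + 1):
        if t <= n:
            # Try every prompt once before using confidence bounds.
            arm = t - 1
        else:
            # Pick the prompt with the highest upper confidence bound.
            arm = max(
                range(n),
                key=lambda i: means[i]
                + math.sqrt(exploration * math.log(t) / counts[i]),
            )
        ret = rollout_return(prompts[arm])
        counts[arm] += 1
        # Incremental update of the mean return for the chosen prompt.
        means[arm] += (ret - means[arm]) / counts[arm]

    return means, counts


if __name__ == "__main__":
    # Toy stand-in for the environment: fixed return per prompt (assumption).
    toy_returns = {"prompt_a": 0.1, "prompt_b": 0.9, "prompt_c": 0.5}
    means, counts = ucb_prompt_selection(
        list(toy_returns), lambda p: toy_returns[p], num_rounds=500
    )
    best = max(range(len(counts)), key=lambda i: counts[i])
    print("most selected prompt:", list(toy_returns)[best])
```

In this sketch, prompts whose trajectories yield higher returns are selected more often over time, while the confidence bonus keeps occasionally revisiting the others. In practice the rollout callback would query the pre-trained model and the environment rather than a fixed lookup table.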
