Meta-Reinforcement Learning with Universal Policy Adaptation: Provable Near-Optimality under All-task Optimum Comparator

Neural Information Processing Systems 

Going beyond existing meta-RL analyses, we provide upper bounds on the expected optimality gap over the task distribution.
