Curriculum Reinforcement Learning using Optimal Transport via Gradual Domain Adaptation

Oct-10-2024, 21:09:44 GMT–Neural Information Processing Systems

Curriculum Reinforcement Learning (CRL) aims to create a sequence of tasks, starting from easy ones and gradually progressing toward difficult ones. In this work, we focus on the idea of framing CRL as interpolations between a source (auxiliary) and a target task distribution. Although existing studies have shown the great potential of this idea, it remains unclear how to formally quantify and generate the movement between task distributions. Inspired by insights from gradual domain adaptation in semi-supervised learning, we create a natural curriculum by breaking down the potentially large task distributional shift in CRL into smaller shifts. We propose GRADIENT, which formulates CRL as an optimal transport problem with a tailored distance metric between tasks.
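To make the interpolation idea concrete, the sketch below builds intermediate task distributions between a source and a target by displacement (Wasserstein) interpolation. It is an illustration only, not the GRADIENT algorithm: it assumes each task is described by a single scalar parameter, that both distributions are given as equally sized sample sets, and that the cost is plain Euclidean distance rather than the tailored task metric the abstract mentions. The names `wasserstein_interpolation`, `build_curriculum`, `easy`, and `hard` are hypothetical.

```python
import numpy as np


def wasserstein_interpolation(source, target, alpha):
    """Displacement interpolation between two 1-D empirical distributions.

    Assumes equal sample counts and a Euclidean cost. In one dimension the
    optimal transport plan matches sorted samples, so moving each matched
    pair a fraction `alpha` of the way gives the intermediate distribution.
    """
    source = np.sort(np.asarray(source, dtype=float))
    target = np.sort(np.asarray(target, dtype=float))
    if source.shape != target.shape:
        raise ValueError("source and target need the same number of samples")
    return (1.0 - alpha) * source + alpha * target


def build_curriculum(source, target, num_stages):
    """Split one large distribution shift into `num_stages` smaller shifts."""
    alphas = np.linspace(0.0, 1.0, num_stages + 1)
    return [wasserstein_interpolation(source, target, a) for a in alphas]


# Hypothetical scalar task parameter (e.g. a difficulty-like setting).
rng = np.random.default_rng(0)
easy = rng.normal(loc=0.0, scale=0.5, size=200)   # source (auxiliary) tasks
hard = rng.normal(loc=5.0, scale=1.0, size=200)   # target tasks

stages = build_curriculum(easy, hard, num_stages=4)
stage_means = [float(stage.mean()) for stage in stages]
```

Each entry of `stages` is a sample set from one intermediate task distribution; an agent would be trained on them in order, so that every step involves only a small shift from the previous one.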

curriculum reinforcement learning, gradual domain adaptation, task distribution


Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.64)