Reviews: Unsupervised Curricula for Visual Meta-Reinforcement Learning

Jan-27-2025, 08:39:40 GMT–Neural Information Processing Systems

This paper presents a method for learning a distribution of tasks to feed to an agent that is learning via meta RL, while simultaneously optimizing the agent to improve more quickly on tasks sampled from this distribution. The task distribution is trained with an objective that maximizes the mutual information between a latent task variable and the trajectories produced by the meta RL agent. The meta RL agent is, roughly speaking, trained to maximize this same mutual information. The overall optimization relies on variational lower bounds on mutual information and on the RL^2 algorithm for meta RL. The experiments show that task distributions and meta RL agents trained in this co-adaptive manner exhibit some potentially useful behaviors, e.g. an improved ability to quickly solve new tasks sampled from an "actual" task distribution, i.e., a task distribution that is not the one co-adapted with the agent.
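To make the role of a variational lower bound on mutual information concrete, here is a minimal sketch on a toy discrete problem. It assumes the standard bound I(z; tau) >= H(z) + E[log q(z | tau)], which holds for any decoder q; the bounds actually used in the paper may differ in form. The joint table, the decoders, and all function names are hypothetical and chosen only for illustration.

```python
import math


def mutual_information(joint):
    """Exact I(z; tau) in nats for a joint table joint[z][tau]."""
    pz = [sum(row) for row in joint]
    pt = [sum(col) for col in zip(*joint)]
    return sum(
        p * math.log(p / (pz[i] * pt[j]))
        for i, row in enumerate(joint)
        for j, p in enumerate(row)
        if p > 0
    )


def variational_lower_bound(joint, q):
    """H(z) + E_{p(z, tau)}[log q(z | tau)], with decoder q[tau][z]."""
    pz = [sum(row) for row in joint]
    entropy_z = -sum(p * math.log(p) for p in pz if p > 0)
    expected_log_q = sum(
        p * math.log(q[j][i])
        for i, row in enumerate(joint)
        for j, p in enumerate(row)
        if p > 0
    )
    return entropy_z + expected_log_q


# Hypothetical joint over 2 latent tasks (rows) and 3 trajectory buckets (columns).
joint = [[0.30, 0.15, 0.05],
         [0.05, 0.15, 0.30]]

# An imperfect decoder q(z | tau): one row per trajectory bucket.
q_rough = [[0.7, 0.3], [0.5, 0.5], [0.3, 0.7]]

# The exact posterior p(z | tau), for which the bound is tight.
pt = [sum(col) for col in zip(*joint)]
q_exact = [[joint[i][j] / pt[j] for i in range(2)] for j in range(3)]

mi = mutual_information(joint)
bound_rough = variational_lower_bound(joint, q_rough)
bound_exact = variational_lower_bound(joint, q_exact)

print(f"exact MI:            {mi:.4f} nats")
print(f"bound (rough q):     {bound_rough:.4f} nats")
print(f"bound (exact post.): {bound_exact:.4f} nats")
```

The imperfect decoder yields a value below the exact mutual information, while the exact posterior recovers it. This is why improving the decoder and maximizing the bound can be alternated in practice.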




