Meta Reinforcement Learning with Successor Feature Based Context
–arXiv.org Artificial Intelligence
Most reinforcement learning (RL) methods focus on learning a single task from scratch and cannot use prior knowledge to learn other tasks more effectively. Context-based meta-RL techniques have recently been proposed as a possible solution. However, they are usually less efficient than conventional RL and may require many trial-and-error interactions during training. To address this, we propose a novel meta-RL approach that achieves performance competitive with existing meta-RL algorithms while requiring significantly fewer environment interactions. By combining context variables with the idea of decomposing the reward in the successor feature framework, our method not only learns high-quality policies for multiple tasks simultaneously but can also quickly adapt to new tasks with a small amount of training. We empirically demonstrate the effectiveness and data efficiency of our method against state-of-the-art meta-RL baselines on several continuous control tasks.
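The reward decomposition mentioned in the abstract can be illustrated with a small sketch of the general successor feature idea. This is not the paper's algorithm: it assumes a hypothetical 3-state chain under a fixed policy, one-hot state features `PHI`, and made-up task weight vectors `W_TASK_A` and `W_TASK_B`, and it leaves out context variables entirely. If the reward is linear in the features, r(s) = phi(s) . w, then the value under that policy is V(s) = psi(s) . w, where the successor features psi(s) are the expected discounted sum of future features and do not depend on w.

```python
# Sketch of the successor-feature idea (NOT the paper's method).
# Assumptions: a 3-state chain with a fixed policy, one-hot state features,
# and hypothetical reward weights w for two tasks.

GAMMA = 0.9

# P[s][s2]: transition probabilities under the fixed policy (hypothetical).
P = [
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.0, 0.0, 1.0],
]

# PHI[s]: feature vector of state s (one-hot here, an assumption).
PHI = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]


def successor_features(P, phi, gamma, iters=500):
    """Iterate psi(s) = phi(s) + gamma * sum_s2 P[s][s2] * psi(s2)."""
    n, d = len(P), len(phi[0])
    psi = [[0.0] * d for _ in range(n)]
    for _ in range(iters):
        psi = [
            [
                phi[s][k] + gamma * sum(P[s][s2] * psi[s2][k] for s2 in range(n))
                for k in range(d)
            ]
            for s in range(n)
        ]
    return psi


def values(psi, w):
    """V(s) = psi(s) . w when the reward is r(s) = phi(s) . w."""
    return [sum(p * wk for p, wk in zip(row, w)) for row in psi]


# The successor features are computed once and shared across tasks.
PSI = successor_features(P, PHI, GAMMA)

# Two hypothetical tasks that differ only in their reward weights.
W_TASK_A = [0.0, 0.0, 1.0]  # reward for being in the last state
W_TASK_B = [1.0, 0.0, 0.0]  # reward for being in the first state

V_A = values(PSI, W_TASK_A)
V_B = values(PSI, W_TASK_B)
```

In this toy setting, switching tasks only requires a new weight vector and a dot product, which gives some intuition for why separating dynamics from reward can reduce the interactions needed for a new task.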
Jul-29-2022
- Country:
- Asia > China (0.04)
- Europe > Germany
- North Rhine-Westphalia > Upper Bavaria > Munich (0.04)
- Genre:
- Research Report (1.00)
- Technology: