Rewriting History with Inverse RL: Hindsight Inference for Policy Improvement
Neural Information Processing Systems
Multi-task reinforcement learning (RL) aims to simultaneously learn policies for solving many tasks. Several prior works have found that relabeling past experience with different reward functions can improve sample efficiency. Relabeling methods typically pose the question: if, in hindsight, we assume that our experience was optimal for some task, for what task was it optimal? In this paper we show that inverse RL is a principled mechanism for reusing experience across tasks. We use this idea to generalize goal-relabeling techniques from prior work to arbitrary types of reward functions.
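The relabeling question posed above can be sketched in code. A minimal, hypothetical simplification of the idea: given a set of candidate reward functions, relabel a trajectory with the task under which it scores highest. Here the score is just the trajectory's total reward under each candidate, a stand-in for the full inverse-RL posterior over tasks; the function and task names are illustrative, not the paper's implementation.

```python
import numpy as np

def relabel_with_inverse_rl(trajectory, reward_fns):
    """Relabel a trajectory with the task under which it looks most optimal.

    Hypothetical simplification: score each candidate reward function by the
    trajectory's total reward and return the index of the best-scoring task,
    a crude proxy for a MaxEnt-IRL posterior over tasks.
    """
    returns = np.array([sum(r(s, a) for s, a in trajectory) for r in reward_fns])
    return int(np.argmax(returns))

# Toy example: two goal-reaching tasks on a 1-D state; the trajectory
# happens to end near goal 1, so in hindsight it is relabeled as task 1.
goal_rewards = [
    lambda s, a: -abs(s - 0.0),  # task 0: stay near state 0
    lambda s, a: -abs(s - 1.0),  # task 1: stay near state 1
]
traj = [(0.9, 0), (0.95, 0), (1.0, 0)]
print(relabel_with_inverse_rl(traj, goal_rewards))  # → 1
```

This recovers goal relabeling as a special case (each candidate reward is "reach goal g") while extending naturally to arbitrary reward families, which is the generalization the abstract describes.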
Oct-11-2024, 01:23:21 GMT