Reviews: Lifelong Inverse Reinforcement Learning
–Neural Information Processing Systems
Summary: This paper considers the problem of lifelong inverse reinforcement learning, where the goal is to learn a set of reward functions (from demonstrations) that can be applied to a series of tasks. The authors propose to do this by learning and continuously updating a shared latent space of reward components, which are combined with task specific coefficients to reconstruct the reward for a particular task. The derivation of the algorithm basically mirrors the Efficient Lifelong Learning Algorithm (ELLA) (citation [33]). Although ELLA was formulated for supervised learning, variants such as PG-ELLA (not cited in this paper, by Ammar et al. "Online Multi-task Learning for Policy Gradient Methods") have applied the same derivation procedure to extend the original ELLA algorithm to the reinforcement learning setting. This paper is another extension of ELLA, to the inverse reinforcement learning setting, where instead of sharing policies via a latent space, they are sharing reward functions.
Neural Information Processing Systems
Oct-7-2024, 07:24:06 GMT
- Technology: