Interaction-limited Inverse Reinforcement Learning

Troussard, Martin; Pignat, Emmanuel; Kamalaruban, Parameswaran; Calinon, Sylvain; Cevher, Volkan

arXiv.org Machine Learning 

Learning from Demonstrations (LfD) is an active research area that addresses the problem of learning to perform a task by observing demonstrations provided by an expert. This approach plays an important role in many real-life learning settings, including human-to-robot interaction [1, 2, 3, 4, 5]. Two popular approaches to LfD are (i) behavioral cloning, which directly mimics the expert's behavior without inferring the underlying objective [6], and (ii) inverse reinforcement learning (IRL), which infers the reward function (i.e., the objective of the task) that explains the expert's behavior [7]. In this work, we focus on the IRL approach to LfD. Typically, the IRL learner assumes that the demonstrated expert behavior is optimal with respect to some reward function, even when that reward function cannot be specified explicitly, as it is in typical reinforcement learning (RL).
