Interaction-limited Inverse Reinforcement Learning
Martin Troussard, Emmanuel Pignat, Parameswaran Kamalaruban, Sylvain Calinon, Volkan Cevher
Learning from Demonstrations (LfD) is an active research area that addresses the problem of learning to perform a task by observing demonstrations provided by an expert. This approach plays an important role in many real-life learning settings, including human-to-robot interaction [1, 2, 3, 4, 5]. Two popular approaches to LfD are (i) behavioral cloning, which directly mimics the expert behavior without understanding the objective [6], and (ii) inverse reinforcement learning (IRL), which infers the reward function (i.e., the objective of the task) that explains the expert behavior [7]. In this work, we focus on the IRL approach to LfD. Typically, the IRL learner assumes that the demonstrated expert behavior is optimal with respect to some reward function, even if that reward function cannot be specified explicitly, as it is in typical reinforcement learning (RL).
Jul-1-2020
- Country:
- North America > United States
- New Jersey > Hudson County > Secaucus (0.04)
- Asia > Vietnam
- Genre:
- Research Report > New Finding (0.68)
- Industry:
- Education (1.00)