Primal Wasserstein Imitation Learning

Robert Dadashi, Léonard Hussenot, Matthieu Geist, Olivier Pietquin

arXiv.org Machine Learning 

Reinforcement Learning (RL) has solved a number of difficult tasks, whether in games [Tesauro, 1995, Mnih et al., 2015, Silver et al., 2016] or in robotics [Abbeel and Ng, 2004, Andrychowicz et al., 2020]. However, RL relies on the existence of a reward function, which can be either hard to specify or too sparse to be used in practice. Imitation Learning (IL) is a paradigm that applies to these environments with hard-to-specify rewards: we seek to solve a task by learning a policy from a fixed number of demonstrations generated by an expert. IL methods can typically be folded into two paradigms: Behavioral Cloning [Pomerleau, 1991, Bagnell et al., 2007, Ross and Bagnell, 2010] and Inverse Reinforcement Learning [Russell, 1998, Ng et al., 2000]. In Behavioral Cloning, we recover the expert's behavior by directly learning a policy that matches it in some sense. In Inverse Reinforcement Learning (IRL), we assume that the demonstrations come from an agent acting optimally with respect to an unknown reward function, which we seek to recover in order to subsequently train an agent on it. IRL methods thus introduce an intermediary problem to solve, namely recovering the reward function.
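
As a concrete illustration of the Behavioral Cloning paradigm described above, the following is a minimal sketch that fits a policy to expert (state, action) pairs by supervised regression. The network architecture, the synthetic expert data, and the use of PyTorch with a mean-squared-error loss are assumptions made for this example, not details taken from the paper.

```python
# Minimal behavioral-cloning sketch: supervised regression of expert actions
# from states. All sizes and data below are illustrative assumptions.
import torch
import torch.nn as nn

state_dim, action_dim = 8, 2

# Stand-in for a fixed set of expert demonstrations, i.e. (state, action) pairs.
expert_states = torch.randn(1000, state_dim)
expert_actions = torch.randn(1000, action_dim)

# Deterministic policy network pi(s) -> a.
policy = nn.Sequential(
    nn.Linear(state_dim, 64), nn.ReLU(),
    nn.Linear(64, action_dim),
)
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)

# Behavioral cloning: minimize the discrepancy between the policy's actions
# and the expert's actions on the demonstrated states.
for step in range(2000):
    idx = torch.randint(0, expert_states.shape[0], (64,))
    loss = nn.functional.mse_loss(policy(expert_states[idx]), expert_actions[idx])
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

IRL methods, by contrast, would use the demonstrations to learn a reward function first and then train an agent on that reward with standard RL.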
