Wasserstein Adversarial Imitation Learning

Xiao, Huang; Herman, Michael; Wagner, Joerg; Ziesche, Sebastian; Etesami, Jalal; Linh, Thai Hong

Jun-19-2019–arXiv.org Machine Learning

Imitation learning is the problem of recovering an expert policy from demonstrations. While inverse reinforcement learning approaches are known to be very sample-efficient in terms of expert demonstrations, they usually require problem-dependent reward functions or (task-)specific reward-function regularization. In this paper, we show a natural connection between inverse reinforcement learning approaches and optimal transport, which enables more general reward functions with desirable properties (e.g., smoothness). Based on this observation, we propose a novel approach called Wasserstein Adversarial Imitation Learning. Our approach uses the Kantorovich potentials as a reward function and further leverages regularized optimal transport to enable large-scale applications. In several robotic experiments, our approach outperforms the baselines in terms of average cumulative rewards and shows a significant improvement in sample efficiency, requiring just one expert demonstration.
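To make the terms "Kantorovich potentials" and "regularized optimal transport" concrete, the following is a minimal sketch, not the authors' implementation. It assumes entropic regularization, a squared Euclidean cost, and uniform weights on two small sample sets standing in for agent and expert state-action samples; none of these choices is stated in the abstract. The names `sinkhorn_potentials` and `_logsumexp` are hypothetical, and how a potential would be converted into a reward (sign, scaling, function approximation) is left open.

```python
import numpy as np

def _logsumexp(z, axis):
    # Numerically stable log(sum(exp(z))) along one axis.
    m = z.max(axis=axis, keepdims=True)
    return np.squeeze(m + np.log(np.exp(z - m).sum(axis=axis, keepdims=True)), axis=axis)

def sinkhorn_potentials(x, y, eps=1.0, n_iters=200):
    """Log-domain Sinkhorn iterations for entropy-regularized optimal transport.

    x: (n, d) samples, y: (m, d) samples, both with uniform weights (assumption).
    Returns the dual potentials f (on x), g (on y) and the transport plan.
    """
    n, m = x.shape[0], y.shape[0]
    log_a = np.full(n, -np.log(n))
    log_b = np.full(m, -np.log(m))
    # Pairwise squared Euclidean cost (an assumed ground cost).
    cost = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)
    f = np.zeros(n)
    g = np.zeros(m)
    for _ in range(n_iters):
        # Alternating updates that enforce the row, then the column marginals.
        f = -eps * _logsumexp((g[None, :] - cost) / eps + log_b[None, :], axis=1)
        g = -eps * _logsumexp((f[:, None] - cost) / eps + log_a[:, None], axis=0)
    log_plan = (f[:, None] + g[None, :] - cost) / eps + log_a[:, None] + log_b[None, :]
    return f, g, np.exp(log_plan)

# Synthetic stand-ins for agent and expert samples (illustrative data only).
rng = np.random.default_rng(0)
agent = rng.normal(0.0, 1.0, size=(40, 2))
expert = rng.normal(1.0, 1.0, size=(50, 2))
f, g, plan = sinkhorn_potentials(agent, expert)
```

Here `f` assigns one scalar to each agent sample and `g` one to each expert sample. The log-domain updates avoid the underflow that a direct `exp(-cost / eps)` kernel can cause when `eps` is small relative to the costs.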

artificial intelligence, reinforcement learning, reward function

Country:
- Europe (1.00)
- North America > United States
  - California > San Francisco County
    - San Francisco (0.14)
  - New York > New York County
    - New York City (0.14)

Genre:
- Research Report (1.00)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Neural Networks > Deep Learning (0.46)
  - Reinforcement Learning (1.00)
