Imitation with Neural Density Models

Kim, Kuno; Jindal, Akshat; Song, Yang; Song, Jiaming; Sui, Yanan; Ermon, Stefano

Oct-19-2020–arXiv.org Artificial Intelligence

Imitation Learning (IL) algorithms aim to learn optimal behavior by mimicking expert demonstrations. Perhaps the simplest IL method is Behavioral Cloning (BC) [Pomerleau, 1991], which ignores the dynamics of the underlying Markov Decision Process (MDP) that generated the demonstrations and treats IL as a supervised learning problem of predicting optimal actions given states. Prior work showed that even if the learned policy incurs a small BC loss, the worst-case performance gap between the expert and the imitator can grow quadratically with the number of decision steps [Ross and Bagnell, 2010, Ross et al., 2011a]. The crux of their argument is that policies that are "close" as measured by BC loss can induce disastrously different distributions over states when deployed in the environment. One family of solutions for mitigating such compounding errors is Interactive IL [Ross et al., 2011b, 2013, Guo et al., 2014], which involves running the imitator's policy and collecting corrective actions from an interactive expert. However, interactive expert queries can be expensive and are seldom available. Another family of approaches [Ho and Ermon, 2016, Fu et al., 2017, Ke et al., 2020, Kostrikov et al., 2020, Kim and Park, 2018, Wang et al., 2017] that has gained much traction is to directly minimize a statistical distance between the state-action distributions induced by the policies of the expert and the imitator, i.e., the occupancy measures ρ
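To make the description of Behavioral Cloning above concrete, here is a minimal sketch of BC as plain supervised regression from states to actions. It is an illustration only, not the method of the paper: the linear policy, the continuous action space, the ridge term `l2`, and the synthetic "expert" matrix `W_expert` are all assumptions introduced for this example. Note that the fit uses only (state, action) pairs and never touches the MDP's transition dynamics, which is exactly the property the text attributes to BC.

```python
import numpy as np


def behavioral_cloning(states, actions, l2=1e-6):
    """Fit a linear policy a = s @ W to expert (state, action) pairs.

    Uses the closed-form ridge regression solution; `l2` is a small
    regularizer assumed here for numerical stability.
    """
    d = states.shape[1]
    gram = states.T @ states + l2 * np.eye(d)
    return np.linalg.solve(gram, states.T @ actions)


def bc_loss(W, states, actions):
    """Mean squared error between predicted and expert actions."""
    return float(np.mean((states @ W - actions) ** 2))


# Synthetic demonstrations from a hypothetical linear expert (assumption).
rng = np.random.default_rng(0)
W_expert = np.array([[1.0, -0.5], [0.3, 2.0], [-1.2, 0.7]])
states = rng.normal(size=(500, 3))
actions = states @ W_expert + 0.01 * rng.normal(size=(500, 2))

W_hat = behavioral_cloning(states, actions)
loss = bc_loss(W_hat, states, actions)
```

A small `loss` here only says the imitator matches the expert on states the expert visited. It says nothing about states the imitator reaches on its own once deployed, which is where the compounding errors discussed above arise.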


