On the Global Convergence of Imitation Learning: A Case for Linear Quadratic Regulator

Cai, Qi, Hong, Mingyi, Chen, Yongxin, Wang, Zhaoran

Jan-11-2019, arXiv.org Machine Learning

Imitation learning is a paradigm that learns from expert demonstration to perform a task. The most straightforward approach to imitation learning is behavioral cloning (Pomerleau, 1991), which learns from expert trajectories to predict the expert action at any state. Despite its simplicity, behavioral cloning ignores the accumulation of prediction error over time. Consequently, although the learned policy closely resembles the expert policy at a given point in time, their trajectories may diverge in the long term. To remedy the issue of error accumulation, inverse reinforcement learning (Russell, 1998; Ng and Russell, 2000; Abbeel and Ng, 2004; Ratliff et al., 2006; Ziebart et al., 2008; Ho and Ermon, 2016) jointly learns a reward function and the corresponding optimal policy, such that the expected cumulative



