Dyna-AIL: Adversarial Imitation Learning by Planning
Saxena, Vaibhav, Sivanandan, Srinivasan, Mathur, Pulkit
arXiv.org Artificial Intelligence
Adversarial methods for imitation learning have been shown to perform well on various control tasks. However, they require a large number of environment interactions to converge. In this paper, we propose an end-to-end differentiable adversarial imitation learning algorithm in a Dyna-like framework that switches between model-based planning and model-free learning from expert data. Our results on both discrete and continuous environments show that combining model-based planning with model-free learning converges to an optimal policy with fewer environment interactions than state-of-the-art learning methods.
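For readers unfamiliar with the Dyna pattern the abstract refers to, the following is a minimal sketch of classic tabular Dyna-Q, not the paper's adversarial imitation algorithm: it has no discriminator and no expert data. It only illustrates how real environment steps (model-free updates) can be interleaved with simulated steps drawn from a learned model (planning). The chain environment, the function name `dyna_q`, and all hyperparameters are hypothetical choices made for this illustration.

```python
import random


def dyna_q(n_states=6, episodes=30, planning_steps=10,
           alpha=0.5, gamma=0.9, eps=0.1, seed=0):
    """Tabular Dyna-Q on a hypothetical 1-D chain.

    The agent starts at state 0 and receives reward 1 on reaching the
    last state, which ends the episode. Returns the greedy policy for
    the non-terminal states and the number of real environment steps.
    """
    rng = random.Random(seed)
    actions = (-1, 1)  # move left / move right
    goal = n_states - 1
    q = {(s, a): 0.0 for s in range(n_states) for a in actions}
    model = {}  # learned deterministic model: (s, a) -> (reward, next_state)
    real_steps = 0

    def env_step(s, a):
        # Real environment: clamp to the chain, reward only at the goal.
        s2 = min(max(s + a, 0), goal)
        return (1.0 if s2 == goal else 0.0), s2

    def greedy(s):
        # Greedy action with random tie-breaking.
        best = max(q[(s, a)] for a in actions)
        return rng.choice([a for a in actions if q[(s, a)] == best])

    def update(s, a, r, s2):
        # One-step Q-learning backup; no bootstrapping from the goal state.
        target = r if s2 == goal else r + gamma * max(q[(s2, b)] for b in actions)
        q[(s, a)] += alpha * (target - q[(s, a)])

    for _ in range(episodes):
        s = 0
        while s != goal:
            a = rng.choice(actions) if rng.random() < eps else greedy(s)
            r, s2 = env_step(s, a)
            real_steps += 1
            update(s, a, r, s2)        # model-free learning from real experience
            model[(s, a)] = (r, s2)    # model learning
            for _ in range(planning_steps):
                # Planning: replay transitions simulated by the learned model.
                ps, pa = rng.choice(list(model))
                pr, ps2 = model[(ps, pa)]
                update(ps, pa, pr, ps2)
            s = s2

    policy = [max(actions, key=lambda a: q[(s, a)]) for s in range(goal)]
    return policy, real_steps
```

Calling `dyna_q(planning_steps=0)` reduces the loop to plain Q-learning, which makes it easy to compare the number of real steps used with and without planning on this toy problem.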
Mar-7-2019