Automated curriculum generation for Policy Gradients from Demonstrations

Srinivasan, Anirudh, Bahdanau, Dzmitry, Chevalier-Boisvert, Maxime, Bengio, Yoshua

Dec-1-2019–arXiv.org Artificial Intelligence

In this paper, we present a technique that improves the training of a reinforcement learning (RL) agent for instruction following. We develop a training curriculum that uses a nominal number of expert demonstrations and trains the agent in a way that parallels one of the ways in which humans learn to perform complex tasks, i.e., by starting from the goal and working backwards. We test our method on the BabyAI platform and show an improvement in sample efficiency on some of its tasks compared to a PPO (proximal policy optimization) baseline.
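
The idea of starting from the goal and working backwards can be illustrated with a minimal sketch. This is not the paper's algorithm, which the abstract does not specify; it is one common way to build such a curriculum under stated assumptions. The agent first starts one step before the end of an expert demonstration, and the start point moves earlier once the success rate passes a threshold. The names `backward_curriculum` and `run_episode`, and the parameters `threshold`, `episodes_per_stage`, `step_back`, and `max_stages`, are all hypothetical.

```python
from typing import Callable, List


def backward_curriculum(
    demo_length: int,
    run_episode: Callable[[int], bool],
    threshold: float = 0.8,
    episodes_per_stage: int = 20,
    step_back: int = 1,
    max_stages: int = 1000,
) -> List[int]:
    """Return the start index used at each curriculum stage.

    A start index k means the episode begins from the state reached after
    k expert actions, so demo_length - 1 is one action away from the goal
    and 0 is the original task. `run_episode(k)` is a hypothetical callable
    that runs (and trains on) one episode from index k and reports success.
    """
    if demo_length < 1:
        raise ValueError("demo_length must be at least 1")

    start = demo_length - 1  # easiest stage: right next to the goal
    history: List[int] = []

    for _ in range(max_stages):
        history.append(start)
        successes = sum(run_episode(start) for _ in range(episodes_per_stage))
        if successes / episodes_per_stage >= threshold:
            if start == 0:
                break  # the full task is solved reliably
            # Assumption: move the start a fixed number of steps earlier.
            start = max(0, start - step_back)
        # Otherwise stay at the same stage and keep training.

    return history
```

With a stub `run_episode` that always succeeds, a five-step demonstration passes through start indices 4, 3, 2, 1, 0. In practice `run_episode` would reset the environment to the stored demonstration state and perform a policy-gradient update such as PPO.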

agent, curriculum, demonstration

Country:
- North America
  - United States (0.04)
  - Canada > Quebec
    - Montreal (0.05)

Genre:
- Research Report (0.82)
- Instructional Material > Course Syllabus & Notes (0.34)

Industry:
- Education (0.68)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning > Reinforcement Learning (0.72)
  - Robots (0.69)
  - Representation & Reasoning > Agents (0.47)
