Mastering Rate based Curriculum Learning
Willems, Lucas; Lahlou, Salem; Bengio, Yoshua
Recently, deep reinforcement learning algorithms have been successfully applied to a wide range of domains ([1], [2], [3], [4]). However, their success relies heavily on dense rewards being given to the agent, and learning in environments with sparse rewards remains a major limitation of RL because current algorithms are sample-inefficient in such scenarios. In sparse-reward settings, this inefficiency is essentially caused by the low likelihood of the agent obtaining a reward through random exploration. Recent attempts to tackle this issue revolve around giving the agent an intrinsic reward that encourages it to explore new states of the environment, thus increasing the likelihood of reaching the reward ([5], [6], [7]). An alternative way to improve sample efficiency is curriculum learning ([8]). It consists of first training the agent on an easy version of the task at hand, where it can obtain reward more easily and learn, then training on increasingly difficult versions using the previously learned policy, and finally training on the task at hand. Its usage is not limited to reinforcement learning and robotics tasks; it also applies to supervised tasks. Curriculum learning may be decomposed into two parts: 1. Defining the curriculum, i.e. the set of tasks the learner may be trained on.
Aug-14-2020