GRIMGEP: Learning Progress for Robust Goal Sampling in Visual Deep Reinforcement Learning
Grgur Kovač, Adrien Laversanne-Finot, Pierre-Yves Oudeyer
arXiv.org Artificial Intelligence
Although recent work in reinforcement learning has shown that robots can learn complex individual skills such as grasping [2], locomotion [3, 4], and manipulation [5], designing reinforcement learning algorithms that perform well in sparse-reward scenarios remains an open challenge in artificial intelligence. Standard reinforcement learning algorithms struggle in the sparse-reward setting because they rely on simple exploration behavior such as random actions. As a result, learning complex tasks often requires manually collecting examples [6, 7] or running learning algorithms for long periods of time, which may not be possible in real-world scenarios. Better exploration schemes would help agents autonomously discover interesting features that can then be used to learn the long-term objective, and would thus help create more autonomous learning agents. Several approaches have been considered to improve the exploration performance of reinforcement learning algorithms. One approach is to reward the agent for discovering novel observations, in the form of an intrinsic reward that is added to the original reward of the environment [8].
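To make the idea of an intrinsic reward added to the environment reward concrete, here is a minimal sketch. It is not the method of [8] or of this paper: it assumes a simple count-based novelty bonus over hashable observations, and the class name `CountBasedBonus`, the scale `beta`, and the observation labels are all hypothetical, chosen only for illustration.

```python
from collections import Counter
import math


class CountBasedBonus:
    """Illustrative novelty bonus (an assumption, not the cited method):
    intrinsic reward = beta / sqrt(number of visits to the observation)."""

    def __init__(self, beta=0.1):
        self.beta = beta          # hypothetical scale of the intrinsic reward
        self.counts = Counter()   # visit count per (hashable) observation

    def total_reward(self, observation, extrinsic_reward):
        # Record the visit, then compute a bonus that shrinks with repeats.
        self.counts[observation] += 1
        intrinsic = self.beta / math.sqrt(self.counts[observation])
        # The intrinsic reward is added to the environment's own reward.
        return extrinsic_reward + intrinsic


bonus = CountBasedBonus(beta=1.0)
first = bonus.total_reward("room_A", 0.0)   # first visit to room_A
second = bonus.total_reward("room_A", 0.0)  # repeat visit, smaller bonus
other = bonus.total_reward("room_B", 0.0)   # a new observation
```

Even with a zero environment reward, the agent receives a signal that is largest for observations it has not seen before. Exact counting like this only works for small discrete observation spaces; with image observations the novelty estimate would have to come from a learned model or representation instead.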
Aug-10-2020