Interactive Learning with Corrective Feedback for Policies based on Deep Neural Networks
Rodrigo Pérez-Dattari, Carlos Celemin, Javier Ruiz-del-Solar, Jens Kober
Deep Reinforcement Learning (DRL) has obtained unprecedented results in decision-making problems, such as playing Atari games [1] or beating the world champion in Go [2]. Nevertheless, in robotics, DRL is still limited in its application to real-world systems [3]. Most of the tasks that have been successfully addressed with DRL share two characteristics: 1) they have well-specified reward functions, and 2) they require large numbers of trials, which means long training periods (or powerful computers) to obtain satisfactory behavior. These two characteristics can be problematic in cases where 1) the goals of the task are poorly defined or hard to specify or model (no reward function exists), 2) executing many trials is not feasible (as with real systems) and/or little computational power or time is available, and 3) additional external perception is sometimes necessary for computing the reward/cost function. On the other hand, Machine Learning methods that rely on the transfer of human knowledge, known as Interactive Machine Learning (IML) methods, have been shown to be time-efficient for obtaining well-performing policies and may not require a well-specified reward function; moreover, some methods do not need expert human teachers to train high-performance agents [4-6].
Sep-30-2018