Surprise and Curiosity for Big Data Robotics
White, Adam (University of Alberta) | Modayil, Joseph (University of Alberta) | Sutton, Richard S. (University of Alberta)
This paper introduces a new perspective on curiosity and intrinsic motivation, viewed as the problem of generating behavior data for parallel off-policy learning. We provide 1) the first measure of surprise based on off-policy general value function learning progress, 2) the first investigation of reactive behavior control with parallel gradient temporal difference learning and function approximation, and 3) the first demonstration of using curiosity-driven control to react to a non-stationary learning task, all on a mobile robot. Our approach improves scalability over previous off-policy robot learning systems, which is essential for making progress on the ultimate big-data decision-making problem: lifelong robot learning.
Jul-22-2014