Generating Real-Time Crowd Advice to Improve Reinforcement Learning Agents
Cruz, Gabriel Victor de la (Washington State University) | Peng, Bei (Washington State University) | Lasecki, Walter Stephen (University of Rochester) | Taylor, Matthew Edmund (Washington State University)
Reinforcement learning is a powerful machine learning paradigm that allows agents to autonomously learn to maximize a scalar reward. However, it often suffers from poor initial performance and long learning times. This paper discusses how collecting online human feedback, both in real time and post hoc, can potentially improve the performance of such learning systems. We use the game Pac-Man to simulate a navigation setting and show that workers are able to accurately identify both when a sub-optimal action is executed and what action should have been performed instead. Our results demonstrate that the crowd is capable of generating helpful input. We conclude by discussing the types of errors that occur most commonly when engaging human workers for this task and how such data could be used to improve learning. Our work serves as a critical first step in designing systems that use real-time human feedback to improve the learning performance of automated systems on-the-fly.
Figure 1: This screenshot shows the web interface of the user study with the game layout and the components of the Pac-Man game: 1) Pac-Man, 2) 4 Ghosts, 3) Pills, and 4) Power Pills.
Mar-1-2015
- Country:
- Asia > Middle East
- Jordan (0.04)
- North America > United States
- California > San Mateo County
- Menlo Park (0.04)
- Massachusetts > Middlesex County
- Cambridge (0.04)
- New York > New York County
- New York City (0.04)
- Washington (0.05)
- Wisconsin > Dane County
- Madison (0.04)
- Genre:
- Research Report > New Finding (0.68)
- Industry:
- Leisure & Entertainment > Games > Computer Games (0.78)
- Technology: