Reinforcement and Imitation Learning via Interactive No-Regret Learning
Ross, Stephane; Bagnell, J. Andrew
Recent work has demonstrated that problems in which a learner's predictions influence the input distribution it is tested on, particularly imitation learning and structured prediction, can be naturally addressed by an interactive approach and analyzed using no-regret online learning. These approaches to imitation learning, however, neither require nor benefit from information about the cost of actions. We extend existing results in two directions: first, we develop an interactive imitation learning approach that leverages cost information; second, we extend the technique to address reinforcement learning. The results provide theoretical support for the commonly observed successes of online approximate policy iteration. Our approach suggests a broad new family of algorithms and provides a unifying view of existing techniques for imitation and reinforcement learning.
Jun-23-2014
- Country:
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
- Genre:
- Research Report (0.64)
- Industry:
- Education (0.36)
- Technology: