Measuring and Characterizing Generalization in Deep Reinforcement Learning
Witty, Sam; Lee, Jun Ki; Tosch, Emma; Atrey, Akanksha; Littman, Michael; Jensen, David
arXiv.org Artificial Intelligence
Deep reinforcement-learning methods have achieved remarkable performance on challenging control tasks. Observations of the resulting behavior give the impression that the agent has constructed a generalized representation that supports insightful action decisions. We re-examine what is meant by generalization in RL, and propose several definitions based on an agent's performance in on-policy, off-policy, and unreachable states. We propose a set of practical methods for evaluating agents with these definitions of generalization. We demonstrate these techniques on a common benchmark task for deep RL, and we show that the learned networks make poor decisions for states that differ only slightly from on-policy states, even though those states are not selected adversarially. Taken together, these results call into question the extent to which deep Q-networks learn generalized representations, and suggest that more experimentation and analysis are necessary before claims of representation learning can be supported.
Dec-11-2018