Quantifying Generalization in Reinforcement Learning

Karl Cobbe, Oleg Klimov, Chris Hesse, Taehoon Kim, John Schulman

arXiv.org Machine Learning 

Generalizing between tasks remains difficult for state-of-the-art deep reinforcement learning (RL) algorithms. Although trained agents can solve complex tasks, they struggle to transfer their experience to new environments. Agents that have mastered ten levels in a video game often fail catastrophically when first encountering the eleventh. Humans can seamlessly generalize across such similar tasks, but this ability is largely absent in RL agents. In short, agents become overly specialized to the environments encountered during training. That RL agents are prone to overfitting is widely appreciated, yet the most common RL benchmarks still encourage training and evaluating on the same set of environments. We believe there is a need for more metrics that evaluate generalization by explicitly separating training and test environments. In the same spirit as the Sonic Benchmark (Nichol et al., 2018), we seek to better quantify an agent's ability to generalize.
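The abstract's central proposal, scoring agents on environments disjoint from those seen during training, can be made concrete with a small sketch. Everything below (the seed-indexed toy levels, the memorizing "agent", the scoring helpers) is illustrative scaffolding and not the paper's benchmark code; the only point is the protocol of disjoint train and test seed sets and the generalization gap it exposes.

```python
import numpy as np

# Minimal sketch of the train/test evaluation protocol the abstract argues
# for, using toy stand-ins. Each integer seed defines one procedurally
# generated "level"; the agent is scored separately on the seeds it trained
# on and on held-out seeds it has never seen.

def episode_return(agent_memory, seed):
    """Toy rollout: an agent that has memorized a level scores well on it
    and scores near chance on unseen levels. Purely illustrative."""
    rng = np.random.default_rng(seed)
    base = rng.uniform(0.0, 1.0)                 # per-level variation
    memorized = 1.0 if seed in agent_memory else 0.0
    return base * 0.2 + memorized * 0.8

def evaluate(agent_memory, seeds):
    """Mean return over the levels defined by `seeds`."""
    return float(np.mean([episode_return(agent_memory, s) for s in seeds]))

train_seeds = set(range(500))              # levels seen during training
test_seeds = set(range(10_000, 10_500))    # disjoint, held-out levels
agent_memory = train_seeds                 # a maximally overfit "agent"

train_score = evaluate(agent_memory, train_seeds)
test_score = evaluate(agent_memory, test_seeds)
print(f"train {train_score:.2f} vs. test {test_score:.2f} "
      f"-> generalization gap {train_score - test_score:.2f}")
```

In the paper's actual setup the levels are procedurally generated (the CoinRun environment introduced in this work), and the gap between training and test performance is precisely the quantity being measured; the toy memorizing agent above simply makes that gap maximal.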
