Quantifying Generalization in Reinforcement Learning
Karl Cobbe, Oleg Klimov, Chris Hesse, Taehoon Kim, John Schulman
Generalizing between tasks remains difficult for state-of-the-art deep reinforcement learning (RL) algorithms. Although trained agents can solve complex tasks, they struggle to transfer their experience to new environments. Agents that have mastered ten levels in a video game often fail catastrophically when first encountering the eleventh. Humans can seamlessly generalize across such similar tasks, but this ability is largely absent in RL agents. In short, agents become overly specialized to the environments encountered during training. That RL agents are prone to overfitting is widely appreciated, yet the most common RL benchmarks still encourage training and evaluating on the same set of environments. We believe there is a need for more metrics that evaluate generalization by explicitly separating training and test environments. In the same spirit as the Sonic Benchmark (Nichol et al., 2018), we seek to better quantify an agent's ability to generalize.
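To make the train/test separation concrete, the following is a minimal sketch of what such an evaluation protocol might look like, assuming levels are procedurally generated from integer seeds so that disjoint seed sets yield disjoint environments. The names `make_level`, `run_episode`, and `Agent` are hypothetical placeholders, not the paper's code; a real agent would be trained with an RL algorithm before evaluation.

```python
import random

# Hedged sketch: measure generalization by evaluating on held-out level seeds
# that were never seen during training. All names here are placeholders.

NUM_TRAIN_LEVELS = 500
random.seed(0)

# Each level is identified by an integer seed; train and test seeds are disjoint.
train_seeds = list(range(NUM_TRAIN_LEVELS))
test_seeds = [NUM_TRAIN_LEVELS + i for i in range(100)]

def make_level(seed):
    """Stand-in for a procedural level generator keyed by a seed."""
    rng = random.Random(seed)
    return {"seed": seed, "layout": [rng.randint(0, 9) for _ in range(16)]}

class Agent:
    """Stand-in policy; a real agent would be trained on the train seeds only."""
    def act(self, observation):
        return random.choice([0, 1, 2, 3])

def run_episode(agent, level):
    """Stand-in rollout returning a binary success signal."""
    obs = level["layout"]
    return float(agent.act(obs) == level["layout"][0] % 4)

def evaluate(agent, seeds, episodes_per_level=1):
    """Average success rate of the agent over the given set of level seeds."""
    successes, total = 0.0, 0
    for seed in seeds:
        level = make_level(seed)
        for _ in range(episodes_per_level):
            successes += run_episode(agent, level)
            total += 1
    return successes / total

agent = Agent()
# The gap between these two scores is the overfitting being measured:
# train levels were seen during training, test levels never were.
train_score = evaluate(agent, train_seeds)
test_score = evaluate(agent, test_seeds)
print(f"train: {train_score:.2f}  test: {test_score:.2f}")
```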
December 5, 2018