A Study on Overfitting in Deep Reinforcement Learning

Zhang, Chiyuan, Vinyals, Oriol, Munos, Remi, Bengio, Samy

Apr-20-2018, arXiv.org Machine Learning

Deep neural networks have proved to be effective function approximators in Reinforcement Learning (RL). Significant progress has been made on many RL problems, ranging from board games like Go (Silver et al., 2016, 2017b), Chess and Shogi (Silver et al., 2017a), and video games like Atari (Mnih et al., 2015) and StarCraft (Vinyals et al., 2017), to real-world robotics and control tasks (Lillicrap et al., 2016). Most of these successes are due to improved training algorithms, carefully designed neural network architectures, and powerful hardware. For example, in AlphaZero (Silver et al., 2017a), 5,000 1st-generation TPUs and 64 2nd-generation TPUs are used during self-play based training of agents with deep residual networks (He et al., 2016).

On the other hand, learning with high-capacity models over long training times on powerful devices carries a risk of overfitting (Hardt et al., 2016; Lin et al., 2016). Preventing overfitting by properly controlling or regularizing training reflects a fundamental tradeoff in machine learning and is key to out-of-sample generalization. Overfitting can be studied from the theory side, where generalization guarantees are derived for specific learning algorithms, or from the practice side, where carefully designed experimental protocols such as cross validation are used as a proxy to certify generalization performance. Unfortunately, in deep RL, systematic studies of generalization behavior, whether theoretical or empirical, are lagging behind the rapid progress in algorithm development and applications. This situation not only makes it difficult to understand test-time behavior, such as vulnerability to adversarial attacks (Huang et al., 2017), but also renders some results difficult to reproduce or compare (Henderson et al., 2017; Machado et al., 2017).

artificial intelligence, machine learning, reinforcement learning


Genre:
- Research Report (0.82)

Industry:
- Leisure & Entertainment > Games > Computer Games (1.00)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Reinforcement Learning (1.00)
  - Neural Networks > Deep Learning (1.00)
