A Modern Take on the Bias-Variance Tradeoff in Neural Networks

Neal, Brady, Mittal, Sarthak, Baratin, Aristide, Tantia, Vinayak, Scicluna, Matthew, Lacoste-Julien, Simon, Mitliagkas, Ioannis

Oct-19-2018–arXiv.org Machine Learning

We revisit the bias-variance tradeoff for neural networks in light of modern empirical findings. The traditional bias-variance tradeoff in machine learning suggests that as model complexity grows, variance increases. Classical bounds in statistical learning theory point to the number of parameters in a model as a measure of model complexity, which means the tradeoff would predict that variance increases with the size of a neural network. However, we find empirically that variance due to training set sampling is roughly constant in practice as both width and depth grow. Variance caused by the non-convexity of the loss landscape behaves differently: in our setting, it decreases with width and increases with depth. We provide theoretical analysis, in a simplified setting inspired by linear models, that is consistent with our empirical findings for width. We view bias-variance as a useful lens through which to study generalization and encourage further theoretical explanation from this perspective.

The traditional view in machine learning is that increasingly complex models achieve lower bias at the expense of higher variance. This balance between underfitting (high bias) and overfitting (high variance) is commonly known as the bias-variance tradeoff (Figure 1). In their landmark work that first highlighted this bias-variance dilemma in machine learning, Geman et al. (1992) suggest that larger neural networks suffer from higher variance.
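The abstract separates two sources of variance: training set sampling and the non-convexity of the loss landscape. One way to make such a split concrete is the law of total variance. The sketch below is an illustration under stated assumptions, not necessarily the authors' estimator. It assumes that the non-convexity term can be probed by retraining with different random initializations on a fixed training set, and that predictions at a single test input are stored in a hypothetical nested list `preds[s][i]` (training set `s`, initialization `i`) with equally many initializations per training set. The function name `decompose_variance` is also hypothetical.

```python
from statistics import fmean, pvariance


def decompose_variance(preds):
    """Split prediction variance at one test input into two parts.

    preds[s][i] is the prediction of a model trained on training set s
    with random initialization i (hypothetical layout; every row must
    have the same number of initializations).
    """
    # Variance attributed to initialization (assumed proxy for the
    # non-convexity term): the variance across initializations,
    # averaged over training sets.
    var_init = fmean([pvariance(row) for row in preds])

    # Variance attributed to training set sampling: the variance, across
    # training sets, of the initialization-averaged prediction.
    var_sampling = pvariance([fmean(row) for row in preds])

    # Total variance over every (training set, initialization) pair.
    total = pvariance([p for row in preds for p in row])

    return var_sampling, var_init, total


if __name__ == "__main__":
    # Toy numbers, not results from the paper.
    toy = [[1.0, 3.0], [5.0, 7.0]]
    var_sampling, var_init, total = decompose_variance(toy)
    print(var_sampling, var_init, total)
```

Because population variances are used and every training set has the same number of initializations, the two parts sum exactly to the total. In a real experiment each entry would come from a separately trained network, and the quantities would be averaged over many test inputs.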
