Failures of model-dependent generalization bounds for least-norm interpolation

Oct-30-2020–arXiv.org Machine Learning

Deep learning methodology has revealed some striking deficiencies of classical statistical learning theory: large neural networks, trained to zero empirical risk on noisy training data, have good predictive accuracy on independent test data. These methods are overfitting (that is, fitting to the training data better than the noise should allow), but the overfitting is benign (that is, prediction performance is good). It is an important open problem to understand why this is possible. The presence of noise is key to why the success of interpolating algorithms is mysterious. Generalization of algorithms that produce a perfect fit in the absence of noise has been studied for decades (see [Haussler, 1992] and its references). A number of recent papers have provided generalization bounds for interpolating algorithms in the absence of noise, either for deep networks or in abstract frameworks motivated by deep networks [Li and Liang, 2018, Arora et al., 2019, Cao and Gu, 2019, Feldman, 2020]. The generalization bounds in these papers either do not hold or become vacuous in the presence of noise: Assumption A1 in [Li and Liang, 2018] rules out noisy data; the data-dependent bound in Arora et al. [2019, Theorem 5.1] becomes vacuous when independent noise is added to the y

artificial intelligence, generalization, machine learning, (18 more...)

arXiv.org Machine Learning

Oct-30-2020

arXiv.org PDF

Add feedback

Country:
- North America > United States
  - New York (0.04)
- Europe > United Kingdom
  - England > Cambridgeshire > Cambridge (0.04)

Genre:
- Research Report (0.64)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Statistical Learning (1.00)
  - Performance Analysis > Accuracy (0.49)
  - Neural Networks > Deep Learning (0.48)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found