Asymptotics of Ridge(less) Regression under General Source Condition

Dominic Richards, Jaouad Mourtada, Lorenzo Rosasco

arXiv.org Machine Learning 

Understanding the generalisation properties of Artificial Deep Neural Networks (ANN) has recently motivated a number of statistical questions. These models perform well in practice despite perfectly fitting (interpolating) the data, a property that seems at odds with classical statistical theory [49]. This has motivated the investigation of the generalisation performance of methods that achieve zero training error (interpolators) [32, 9, 11, 10, 8] and, in the context of linear least squares, of the unique least-norm solution to which gradient descent converges [22, 5, 37, 8, 21, 38, 20, 39]. Overparameterised linear models, where the number of variables exceeds the number of points, are arguably the simplest and most natural setting in which interpolation can be studied. Moreover, in certain regimes ANN can be approximated by suitable linear models [24, 17, 18, 2, 13]. The learning curve (test error versus model capacity) of interpolators has been shown to exhibit a characteristic "Double Descent" shape [1, 7], where the test error decreases after peaking at the "interpolation" threshold, that is, the model capacity required to interpolate the data. The regime beyond this threshold naturally captures the setting of ANN [49] and has thus motivated its investigation [36, 44, 39].
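To make the least-norm interpolator and the double descent shape of its learning curve concrete, the following is a minimal sketch on a toy Gaussian random design. The setup (sample size, feature count, noise level, and the sweep over model capacity) is illustrative and not taken from the paper; the minimum-norm solution is computed via the pseudoinverse, which is the solution gradient descent converges to from zero initialisation.

```python
import numpy as np

rng = np.random.default_rng(0)


def min_norm_interpolator(X, y):
    """Minimum-norm least-squares solution w = X^+ y.

    When d > n and X has full row rank, this solution interpolates
    the data (X w = y) and coincides with the limit of gradient
    descent initialised at zero.
    """
    return np.linalg.pinv(X) @ y


# Hypothetical toy problem: only the first d of d_true features are
# used by the model, so omitted features act as extra noise.
n, d_true, noise = 50, 400, 0.5
w_star = rng.standard_normal(d_true) / np.sqrt(d_true)
X_full = rng.standard_normal((2 * n, d_true))
y_full = X_full @ w_star + noise * rng.standard_normal(2 * n)
X_tr, y_tr = X_full[:n], y_full[:n]
X_te, y_te = X_full[n:], y_full[n:]

# Sweep model capacity d; test error typically peaks near the
# interpolation threshold d = n and decreases again beyond it.
for d in [10, 25, 45, 50, 55, 100, 200, 400]:
    w_hat = min_norm_interpolator(X_tr[:, :d], y_tr)
    test_err = np.mean((X_te[:, :d] @ w_hat - y_te) ** 2)
    print(f"d = {d:3d}  test MSE = {test_err:.3f}")
```

For d < n the pseudoinverse gives the ordinary least-squares fit; for d > n it selects, among the infinitely many interpolating solutions, the one of smallest Euclidean norm.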
