Asymptotics of Ridge(less) Regression under General Source Condition
Dominic Richards, Jaouad Mourtada, Lorenzo Rosasco
Understanding the generalisation properties of Artificial Deep Neural Networks (ANNs) has recently motivated a number of statistical questions. These models perform well in practice despite perfectly fitting (interpolating) the data, a property that seems at odds with classical statistical theory [49]. This has motivated the investigation of the generalisation performance of methods that achieve zero training error (interpolators) [32, 9, 11, 10, 8] and, in the context of linear least squares, of the unique least-norm solution to which gradient descent converges [22, 5, 37, 8, 21, 38, 20, 39]. Overparameterized linear models, where the number of variables exceeds the number of points, are arguably the simplest and most natural setting in which interpolation can be studied. Moreover, in certain regimes ANNs can be approximated by suitable linear models [24, 17, 18, 2, 13]. The learning curve (test error versus model capacity) of interpolators has been shown to exhibit a characteristic "double descent" shape [1, 7], where the test error decreases after peaking at the "interpolation" threshold, that is, the model capacity required to interpolate the data. The regime beyond this threshold naturally captures the setting of ANNs [49] and has thus motivated further investigation [36, 44, 39].
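To make the interpolation regime concrete, the following is a minimal NumPy sketch (illustrative only, not code from the paper): in an overparameterized linear model with more variables than samples, the minimum-norm least-squares solution, which gradient descent initialized at zero converges to, interpolates the training data (zero training error) while still incurring a nonzero test error. The dimensions, noise level, and data distribution below are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 200  # overparameterized: more features (p) than samples (n)

# Synthetic linear-regression data (assumed Gaussian design and noise)
X = rng.standard_normal((n, p))
w_true = rng.standard_normal(p) / np.sqrt(p)
y = X @ w_true + 0.1 * rng.standard_normal(n)

# Minimum-norm least-squares solution via the Moore-Penrose pseudoinverse;
# this is the least-norm interpolator to which gradient descent converges.
w_min_norm = np.linalg.pinv(X) @ y

train_mse = np.mean((X @ w_min_norm - y) ** 2)  # ~0: the model interpolates

# Test error on fresh data from the same distribution
X_test = rng.standard_normal((1000, p))
y_test = X_test @ w_true + 0.1 * rng.standard_normal(1000)
test_mse = np.mean((X_test @ w_min_norm - y_test) ** 2)

print(f"train MSE: {train_mse:.2e}, test MSE: {test_mse:.3f}")
```

Sweeping the number of features p (or the sample size n) in this sketch and plotting the test error against p/n is one simple way to reproduce the double-descent shape discussed above.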
Jun-11-2020