Two models of double descent for weak features
Mikhail Belkin, Daniel Hsu, Ji Xu
The "double descent" risk curve was proposed by Belkin, Hsu, Ma, and Mandal [Bel+18] to qualitatively describe the out-of-sample prediction performance of several variably-parameterized machine learning models. This risk curve reconciles the classical bias-variance tradeoff with the behavior of predictive models that interpolate training data, as observed for several model families (including neural networks) in a wide variety of applications [BO98; AS17; Spi+18; Bel+18]. In these studies, a predictive model with p parameters is fit to a training sample of size n, and the test risk (i.e., out-of-sample error) is examined as a function of p. When p is below the sample size n, the test risk is governed by the usual bias-variance decomposition. As p is increased towards n, the training risk (i.e., in-sample error) is driven to zero, but the test risk shoots up towards infinity.
Mar-18-2019