Implicit ridge regularization provided by the minimum-norm least squares estimator when $n\ll p$

Dmitry Kobak, Jonathan Lomond, Benoit Sanchez

arXiv.org Machine Learning 

Conventional wisdom in statistical learning holds that large models require strong regularization to prevent overfitting. This rule has recently been challenged by deep neural networks: despite being expressive enough to fit any training set perfectly, they still generalize well. Here we show that the same is true for linear regression in the under-determined $n\ll p$ situation, provided that one uses the minimum-norm estimator. The case of a linear model with least squares loss allows a full and exact mathematical analysis. We prove that augmenting a model with many random covariates of small constant variance and using the minimum-norm estimator is asymptotically equivalent to adding the ridge penalty. Using toy simulations as well as real-life high-dimensional data sets, we demonstrate that an explicit ridge penalty often fails to provide any improvement over this implicit ridge regularization. In this regime, the minimum-norm estimator achieves zero training error but nevertheless has low expected error.
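As a rough illustration of the claimed equivalence (a minimal NumPy sketch, not code from the paper; the dimensions n, p, q and the variance sigma2 below are illustrative assumptions): augmenting the design matrix with q random covariates of variance sigma2 and taking the minimum-norm least squares solution yields coefficients on the original covariates close to an explicit ridge fit with penalty lambda = q * sigma2, with the gap shrinking as q grows.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative toy problem (all sizes are assumptions for this sketch).
n, p, q = 50, 20, 5000      # n samples, p original covariates, q added random covariates
sigma2 = 0.01               # variance of each added covariate

X = rng.standard_normal((n, p))
beta_true = rng.standard_normal(p)
y = X @ beta_true + 0.5 * rng.standard_normal(n)

# Augment with q random covariates of variance sigma2, then take the
# minimum-norm least squares solution via the pseudoinverse.
Z = np.sqrt(sigma2) * rng.standard_normal((n, q))
X_aug = np.hstack([X, Z])
beta_minnorm = np.linalg.pinv(X_aug) @ y   # minimum-norm estimator on augmented design
beta_minnorm_orig = beta_minnorm[:p]       # coefficients on the original covariates

# Explicit ridge on the original covariates with penalty lambda = q * sigma2.
lam = q * sigma2
beta_ridge = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

# The two coefficient vectors should nearly coincide for large q,
# since Z @ Z.T concentrates around (q * sigma2) * I.
print(np.max(np.abs(beta_minnorm_orig - beta_ridge)))
```

The sketch relies on the identity $(X^\top X + \lambda I)^{-1} X^\top = X^\top (X X^\top + \lambda I)^{-1}$ together with the fact that $Z Z^\top \approx q\sigma^2 I_n$ when $q$ is large, which is the mechanism the abstract describes.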
