Risk Bounds for Over-parameterized Maximum Margin Classification on Sub-Gaussian Mixtures

Cao, Yuan, Gu, Quanquan, Belkin, Mikhail

arXiv.org Machine Learning 

In modern machine learning, complex models such as deep neural networks have received increasing popularity. These complicated models are known to be able to fit noisy training data sets, while at the same time achieving small test errors. In fact, this benign overfitting phenomenon is not a unique feature of deep learning. Even for kernel methods and linear models, Belkin et al. (2018) demonstrated that interpolators on the noisy training data can still perform near optimally on the test data. A series of recent works (Belkin et al., 2019b; Muthukumar et al., 2020b; Hastie et al., 2019; Bartlett et al., 2020) theoretically studied how over-parameterization can achieve small population risk. In particular in Bartlett et al. (2020) the authors considered the setting where the data are generated from a ground-truth linear model with noise, and established a tight population risk bound for the minimum norm linear interpolator with a matching lower bound. More recently, Tsigler and Bartlett (2020) further studied benign overfitting in ridge regression, and established non-asymptotic generalization bounds for over-parametrized ridge regression. They showed that those bounds are tight for a range of regularization parameter values. Notably, these results cover arbitrary covariance structure of the data, and give a nice characterization of how the spectrum of the data covariance matrix affects the population risk in the over-parameterized regime.

Duplicate Docs Excel Report

Title
None found

Similar Docs  Excel Report  more

TitleSimilaritySource
None found