The role of regularization in classification of high-dimensional noisy Gaussian mixture
Mignacco, Francesca; Krzakala, Florent; Lu, Yue M.; Zdeborová, Lenka
We consider a high-dimensional mixture of two Gaussians in the noisy regime, where even an oracle knowing the centers of the clusters misclassifies a small but finite fraction of the points. We provide a rigorous analysis of the generalization error of regularized convex classifiers, including ridge, hinge and logistic regression, in the high-dimensional limit where the number $n$ of samples and their dimension $d$ go to infinity while their ratio is held fixed at $\alpha = n/d$. We discuss surprising effects of the regularization, which in some cases allows the classifier to reach Bayes-optimal performance. We also illustrate the interpolation peak at low regularization, and analyze the role of the respective sizes of the two clusters.
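The "noisy regime" can be made concrete with a small simulation. The sketch below is an illustration only and is not taken from the paper. It assumes two equal-size clusters centered at $+\mu$ and $-\mu$ with isotropic noise of variance $\Delta$, so that a point is $x = y\mu + \sqrt{\Delta}\,z$ with label $y = \pm 1$ and standard Gaussian $z$. Under that assumption the oracle rule $\mathrm{sign}(\mu \cdot x)$ errs with probability $\tfrac{1}{2}\,\mathrm{erfc}\big(\lVert\mu\rVert / \sqrt{2\Delta}\big)$, which is strictly positive for any finite signal-to-noise ratio. The function names, the choice of $\mu$ along the all-ones direction, and the default parameters are all hypothetical.

```python
import math
import random


def oracle_error_theory(mu_norm, delta):
    """Error of sign(<mu, x>) for x = y*mu + sqrt(delta)*z, equal-size clusters.

    This equals Phi(-mu_norm / sqrt(delta)), written here via erfc.
    """
    return 0.5 * math.erfc(mu_norm / math.sqrt(2.0 * delta))


def oracle_error_empirical(d=20, n=20000, delta=1.0, mu_norm=1.0, seed=0):
    """Monte Carlo estimate of the oracle error on n sampled points."""
    rng = random.Random(seed)
    # Assumed center: all-ones direction rescaled to Euclidean norm mu_norm.
    mu = [mu_norm / math.sqrt(d)] * d
    noise_std = math.sqrt(delta)
    errors = 0
    for _ in range(n):
        # Draw a label uniformly from {+1, -1} (balanced clusters).
        y = 1 if rng.random() < 0.5 else -1
        # Sample the point: cluster center plus isotropic Gaussian noise.
        x = [y * m + noise_std * rng.gauss(0.0, 1.0) for m in mu]
        # Oracle prediction: sign of the projection onto the true center.
        score = sum(m * xi for m, xi in zip(mu, x))
        prediction = 1 if score >= 0 else -1
        if prediction != y:
            errors += 1
    return errors / n


if __name__ == "__main__":
    print("theory:   ", oracle_error_theory(1.0, 1.0))
    print("empirical:", oracle_error_empirical())
```

With $\lVert\mu\rVert = 1$ and $\Delta = 1$ the theoretical value is $\Phi(-1) \approx 0.159$, and the Monte Carlo estimate should land close to it. Increasing `delta` pushes the error toward 1/2, while decreasing it recovers the nearly separable case.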
Feb-26-2020
- Country:
- North America > United States
- New York (0.04)
- Massachusetts > Middlesex County
- Cambridge (0.04)
- Europe
- United Kingdom > England
- Oxfordshire > Oxford (0.04)
- France > Île-de-France
- Asia > Middle East
- Israel (0.04)
- Genre:
- Research Report > New Finding (0.48)