On the Optimal Weighted $\ell_2$ Regularization in Overparameterized Linear Regression
Neural Information Processing Systems
Our general setup leads to a number of interesting findings. We outline precise conditions that decide the sign of the optimal setting $\lambda_{\mathrm{opt}}$ of the ridge parameter $\lambda$ and confirm the implicit $\ell_2$ regularization effect of overparameterization, which theoretically justifies the surprising empirical observation that $\lambda_{\mathrm{opt}}$ can be \textit{negative} in the overparameterized regime. We also characterize the double descent phenomenon for principal component regression (PCR) when $\mathbf{X}$ and $\boldsymbol{\beta}_{\star}$ are both anisotropic. Finally, we determine the optimal weighting matrix $\boldsymbol{\Sigma}_w$ for both the ridgeless ($\lambda \to 0$) and the optimally regularized ($\lambda = \lambda_{\mathrm{opt}}$) case, and demonstrate the advantage of the weighted objective over standard ridge regression and PCR.
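For reference, the weighted $\ell_2$ objective the abstract refers to is conventionally written as below; this display is a plausible reconstruction under assumed notation ($\mathbf{y}$ the response vector, $\mathbf{X}$ the design matrix, $\boldsymbol{\Sigma}_w$ a positive-definite weighting matrix), not the paper's verbatim definition:
$$
\hat{\boldsymbol{\beta}}_\lambda
= \arg\min_{\boldsymbol{\beta}} \left\{ \|\mathbf{y} - \mathbf{X}\boldsymbol{\beta}\|_2^2 + \lambda\, \boldsymbol{\beta}^\top \boldsymbol{\Sigma}_w \boldsymbol{\beta} \right\}
= \left(\mathbf{X}^\top \mathbf{X} + \lambda \boldsymbol{\Sigma}_w\right)^{-1} \mathbf{X}^\top \mathbf{y},
$$
which reduces to standard ridge regression when $\boldsymbol{\Sigma}_w = \mathbf{I}$, yields the $\boldsymbol{\Sigma}_w$-weighted minimum-norm interpolator in the ridgeless limit $\lambda \to 0$, and remains well defined for negative $\lambda$ as long as $\mathbf{X}^\top \mathbf{X} + \lambda \boldsymbol{\Sigma}_w$ is invertible.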