ROTI-GCV: Generalized Cross-Validation for right-ROTationally Invariant Data
Luo, Kevin, Li, Yufan, Sur, Pragya
Two key tasks in high-dimensional regularized regression are tuning the regularization strength for good predictions and estimating the out-of-sample risk. It is known that the standard approach -- $k$-fold cross-validation -- is inconsistent in modern high-dimensional settings. While leave-one-out and generalized cross-validation remain consistent in some high-dimensional cases, they become inconsistent when samples are dependent or contain heavy-tailed covariates. To model structured sample dependence and heavy tails, we use right-rotationally invariant covariate distributions - a crucial concept from compressed sensing. In the common modern proportional asymptotics regime where the number of features and samples grow comparably, we introduce a new framework, ROTI-GCV, for reliably performing cross-validation. Along the way, we propose new estimators for the signal-to-noise ratio and noise variance under these challenging conditions. We conduct extensive experiments that demonstrate the power of our approach and its superiority over existing methods.
Jun-17-2024
- Country:
- Europe > United Kingdom
- England > Cambridgeshire > Cambridge (0.14)
- North America > United States
- New York > New York County > New York City (0.04)
- Europe > United Kingdom
- Genre:
- Research Report (0.82)
- Technology: