Cross-validation: what does it estimate and how well does it do it?
Stephen Bates, Trevor Hastie, Robert Tibshirani
When deploying a predictive model, it is important to understand its prediction accuracy on future test points, so both good point estimates and accurate confidence intervals for prediction error are essential. Cross-validation (CV) is a widely used approach for these two tasks, but in spite of its seeming simplicity, its operating properties remain opaque. Considering first estimation, it turns out to be challenging to precisely state the estimand corresponding to the cross-validation point estimate. In this work, we show that the estimand of CV is not the accuracy of the model fit on the data at hand, but is instead the average accuracy over many hypothetical data sets. Specifically, for the special case of linear regression, we show that the CV estimate of error has larger mean squared error (MSE) when estimating the prediction error of the final model than when estimating the average prediction error of models across many unseen data sets. Turning to confidence intervals for prediction error, we show that naïve intervals based on CV can fail badly, giving coverage far below the nominal level; we provide a simple example in Section 1.1. The source of this behavior is the estimation of the variance used to compute the width of the interval: it does not account for the correlation between the error estimates in different folds, which arises because each data point is used for both training and testing. As a result, the estimate of variance is too small and the intervals are too narrow. To address this issue, we develop a modification of cross-validation, nested cross-validation (NCV), that achieves coverage near the nominal level, even in challenging cases where the usual cross-validation intervals have miscoverage rates two to three times larger than the nominal rate.
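To make the naïve interval concrete, the following is a minimal sketch of one common way such an interval is formed: pool the per-point held-out losses from K-fold CV, then treat them as independent when computing the standard error. The function name `naive_cv_interval`, the choice of ordinary least squares with squared-error loss, the simulated data, and the normal quantile 1.96 are all assumptions made for illustration; this is not the authors' code, and it does not implement nested cross-validation.

```python
import numpy as np


def naive_cv_interval(X, y, n_folds=10, z=1.96, seed=0):
    """K-fold CV for least squares with squared-error loss.

    Returns (estimate, lower, upper). The interval is the naive one:
    its standard error treats the held-out errors as independent,
    ignoring the correlation induced by reusing points for training.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    # Randomly partition the indices 0..n-1 into n_folds groups.
    folds = np.array_split(rng.permutation(n), n_folds)
    errors = np.empty(n)
    for test_idx in folds:
        train_mask = np.ones(n, dtype=bool)
        train_mask[test_idx] = False
        # Fit on the training folds only.
        beta, *_ = np.linalg.lstsq(X[train_mask], y[train_mask], rcond=None)
        # Record squared error on the held-out fold.
        errors[test_idx] = (y[test_idx] - X[test_idx] @ beta) ** 2
    estimate = errors.mean()
    # Naive standard error: sample standard deviation over sqrt(n).
    se = errors.std(ddof=1) / np.sqrt(n)
    return estimate, estimate - z * se, estimate + z * se


# Hypothetical simulated data: 100 observations, 20 features.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 20))
y = X @ rng.normal(size=20) + rng.normal(size=100)
est, lower, upper = naive_cv_interval(X, y)
print(f"CV estimate: {est:.3f}, naive 95% interval: ({lower:.3f}, {upper:.3f})")
```

Because every point contributes to training in all folds but its own, the entries of `errors` are correlated, so `se` as computed here tends to understate the true variability. That is the mechanism the abstract identifies for intervals that are too narrow.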
Apr-14-2021