Asymptotics of K-Fold Cross Validation
Li, Jessie (University of California, Santa Cruz)
Journal of Artificial Intelligence Research
This paper investigates the asymptotic distribution of the K-fold cross validation error in an i.i.d. setting. As the number of observations n goes to infinity while keeping the number of folds K fixed, the K-fold cross validation error is √n-consistent for the expected out-of-sample error and has an asymptotically normal distribution. A consistent estimate of the asymptotic variance is derived and used to construct asymptotically valid confidence intervals for the expected out-of-sample error. A hypothesis test is developed for comparing two estimators’ expected out-of-sample errors and a subsampling procedure is used to obtain critical values. Monte Carlo simulations demonstrate the asymptotic validity of our confidence intervals for the expected out-of-sample error and investigate the size and power properties of our test. In our empirical application, we use our estimator selection test to compare the out-of-sample predictive performance of OLS, Neural Networks, and Random Forests for predicting the sale price of a domain name in a GoDaddy expiry auction.
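To make the objects in the abstract concrete, the following is a minimal sketch of a K-fold cross validation error and a normal-approximation confidence interval around it. It is not the paper's procedure: the simple linear model, the squared-error loss, the simulated data, and the function names (`fit_ols`, `kfold_cv_error`, `naive_confidence_interval`) are all assumptions made for illustration. In particular, the standard error below uses the plain sample variance of the held-out losses, which stands in for the paper's consistent variance estimator; the abstract does not spell that estimator out.

```python
import math
import random
from statistics import NormalDist, fmean, variance


def fit_ols(xs, ys):
    """Fit y = a + b*x by least squares and return (a, b)."""
    mx, my = fmean(xs), fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def kfold_cv_error(xs, ys, k=5, seed=0):
    """Return (K-fold CV mean squared error, per-observation held-out losses)."""
    n = len(xs)
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[j::k] for j in range(k)]  # K disjoint folds covering all indices
    losses = [0.0] * n
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        a, b = fit_ols([xs[i] for i in train], [ys[i] for i in train])
        for i in fold:
            # Each observation is scored by a model that never saw it.
            losses[i] = (ys[i] - (a + b * xs[i])) ** 2
    return fmean(losses), losses


def naive_confidence_interval(losses, level=0.95):
    """Normal-approximation interval using the sample variance of the losses.

    Assumption: this naive variance is a placeholder, not the paper's estimator.
    """
    n = len(losses)
    center = fmean(losses)
    se = math.sqrt(variance(losses) / n)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return center - z * se, center + z * se


# Simulated i.i.d. data (hypothetical): y = 1 + 2x + noise.
rng = random.Random(42)
xs = [rng.gauss(0, 1) for _ in range(200)]
ys = [1 + 2 * x + rng.gauss(0, 1) for x in xs]

cv_error, losses = kfold_cv_error(xs, ys, k=5)
lower, upper = naive_confidence_interval(losses)
print(f"5-fold CV error: {cv_error:.3f}, 95% interval: ({lower:.3f}, {upper:.3f})")
```

The number of folds K stays fixed while n grows, matching the asymptotic regime described above. Comparing two estimators would require the paper's test and subsampled critical values, which this sketch does not attempt.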
Nov-14-2023
- Country:
- Europe > United Kingdom
- England > Cambridgeshire > Cambridge (0.04)
- North America > United States
- California > Santa Cruz County > Santa Cruz (0.14)
- Genre:
- Overview (1.00)
- Research Report (1.00)
- Technology: