Inference for the Generalization Error

Dec-31-2000–Neural Information Processing Systems

In order to compare learning algorithms, experimental results reported in the machine learning literature often use statistical tests of significance. Unfortunately, most of these tests do not take into account the variability due to the choice of training set. We perform a theoretical investigation of the variance of the cross-validation estimate of the generalization error that takes into account the variability due to the choice of training sets. This allows us to propose two new ways to estimate this variance. We show, via simulations, that these new statistics perform well relative to the statistics considered by Dietterich (Dietterich, 1998).

1 Introduction

When applying a learning algorithm (or comparing several algorithms), one is typically interested in estimating its generalization error. Point estimation is rather trivial through cross-validation. Providing a variance estimate of that estimation, so that hypothesis testing and/or confidence intervals are possible, is more difficult, especially, as pointed out in (Hinton et al., 1995), if one wants to take into account the variability due to the choice of the training sets (Breiman, 1996). A notable effort in that direction is Dietterich's work (Dietterich, 1998). Careful investigation of the variance to be estimated allows us to provide new variance estimates, which turn out to perform well. Let us first lay out the framework in which we shall work.
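To make the distinction between the point estimate and its variance concrete, the following is a minimal sketch, not the estimators proposed in the paper. It assumes a toy one-dimensional regression problem, a least-squares learner, and repeated random train/test splits; the function name `repeated_holdout_errors` and all parameter values are hypothetical. The variance shown is the naive one that treats the per-split errors as independent, which they are not, since the training sets overlap.

```python
import numpy as np


def repeated_holdout_errors(x, y, n_train, n_splits, rng):
    """Return the mean squared test error of a least-squares fit for each random split."""
    n = len(y)
    errors = np.empty(n_splits)
    for j in range(n_splits):
        perm = rng.permutation(n)
        train, test = perm[:n_train], perm[n_train:]
        # Fit y ~ a * x + b using the training part only.
        design = np.column_stack([x[train], np.ones(n_train)])
        coef, *_ = np.linalg.lstsq(design, y[train], rcond=None)
        pred = coef[0] * x[test] + coef[1]
        errors[j] = np.mean((y[test] - pred) ** 2)
    return errors


# Toy data (an assumption for illustration): y = 2x + unit-variance Gaussian noise.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 2.0 * x + rng.normal(size=200)

errors = repeated_holdout_errors(x, y, n_train=150, n_splits=15, rng=rng)

# Point estimate of the generalization error: average over the splits.
point_estimate = errors.mean()

# Naive variance of that average: sample variance divided by the number of
# splits. This ignores the dependence between splits that share training data.
naive_variance = errors.var(ddof=1) / len(errors)

print(f"point estimate: {point_estimate:.3f}")
print(f"naive variance: {naive_variance:.5f}")
```

Because every split draws from the same 200 examples, the per-split errors are positively correlated, so the naive figure will tend to understate the true variance. That gap is the difficulty the introduction describes.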




