Diagnostic Tool for Out-of-Sample Model Evaluation
Hult, Ludvig; Zachariah, Dave; Stoica, Petre
Assessment of model fitness is a key part of machine learning. The standard paradigm of model evaluation is analysis of the average loss over future data. This is often explicit in model fitting, where we select models that minimize the average loss over training data as a surrogate, but it comes with limited theoretical guarantees. In this paper, we consider the problem of characterizing a batch of out-of-sample losses of a model using a calibration data set. We provide finite-sample limits on the out-of-sample losses that are statistically valid under quite general conditions and propose a diagnostic tool that is simple to compute and interpret. Several numerical experiments show how the proposed method quantifies the impact of distribution shifts, aids the analysis of regression, and enables model selection as well as hyperparameter tuning.
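The core idea is to bound future losses of a fitted model using losses computed on a held-out calibration set. As an illustration only, the sketch below uses a standard distribution-free order-statistic (conformal-style) quantile of the calibration losses as an upper bound for a single new loss; the function name `loss_upper_bound` and the placeholder calibration data are assumptions, and the paper's actual diagnostic for a batch of out-of-sample losses may differ.

```python
# Minimal sketch: calibration-based upper bound on a future loss.
# Assumption: illustrated via a distribution-free order-statistic (conformal-style)
# quantile of calibration losses; this is not claimed to be the paper's exact method.
import numpy as np

def loss_upper_bound(calibration_losses, alpha=0.1):
    """Return a level-(1 - alpha) upper bound for a new loss.

    Valid for exchangeable losses by the usual order-statistic argument:
    P(new loss <= k-th smallest calibration loss) >= k / (n + 1).
    """
    losses = np.sort(np.asarray(calibration_losses, dtype=float))
    n = losses.size
    # Smallest rank k with k / (n + 1) >= 1 - alpha.
    k = int(np.ceil((1 - alpha) * (n + 1)))
    if k > n:
        return np.inf  # too few calibration points for this alpha
    return losses[k - 1]

# Usage with placeholder calibration losses (hypothetical data).
rng = np.random.default_rng(0)
cal_losses = rng.exponential(scale=1.0, size=200)
print(loss_upper_bound(cal_losses, alpha=0.1))
```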
Oct-16-2023
- Country:
- North America
- Canada > Quebec (0.14)
- United States > New York (0.14)
- Genre:
- Research Report (0.82)
- Industry:
- Health & Medicine (0.88)