task distribution
Aligning Validation with Deployment: Target-Weighted Cross-Validation for Spatial Prediction
Brenning, Alexander, Suesse, Thomas
Cross-validation (CV) is commonly used to estimate predictive risk when independent test data are unavailable. Its validity depends on the assumption that validation tasks are sampled from the same distribution as the prediction tasks encountered during deployment. In spatial prediction and other settings with structured data, this assumption is frequently violated, leading to biased estimates of deployment risk. We propose Target-Weighted CV (TWCV), an estimator of deployment risk that accounts for discrepancies between validation and deployment task distributions, namely (1) covariate shift and (2) task-difficulty shift. We characterize prediction tasks by descriptors such as covariates and spatial configuration. TWCV assigns weights to validation losses such that the weighted empirical distribution of validation tasks matches the corresponding distribution over a target domain. The weights are obtained via calibration weighting, yielding an importance-weighted estimator that targets deployment risk. Since TWCV requires adequate coverage of the deployment distribution's support, we combine it with spatially buffered resampling that diversifies the task-difficulty distribution. In a simulation study, conventional as well as spatial estimators exhibit substantial bias depending on sampling, whereas buffered TWCV remains approximately unbiased across scenarios. A case study in environmental pollution mapping further confirms that discrepancies between validation and deployment task distributions can affect performance assessment, and that buffered TWCV better reflects the prediction task over the target domain. These results establish task distribution mismatch as a primary source of CV bias in spatial prediction and show that calibration weighting combined with a suitable validation task generator provides a viable approach to estimating predictive risk under dataset shift.
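The weighting step described in the abstract can be illustrated with a minimal sketch. The abstract does not state which calibration method the authors use, so the code below swaps in one common choice as an assumption: exponential tilting (entropy balancing) that matches only the means of the task descriptors. The function names `calibration_weights` and `weighted_risk`, and the idea of summarizing the target domain by a single mean vector, are hypothetical and serve illustration only.

```python
import numpy as np


def calibration_weights(desc_val, target_mean, n_iter=50, tol=1e-10):
    """Exponential-tilting weights for validation tasks (illustrative assumption).

    desc_val    : (n, p) array of descriptors of the validation tasks
    target_mean : (p,) mean of the same descriptors over the target domain
    Returns positive weights summing to 1 whose weighted descriptor mean
    matches target_mean, provided target_mean lies inside the convex hull
    of the validation descriptors (the coverage requirement).
    """
    Z = np.asarray(desc_val, dtype=float) - np.asarray(target_mean, dtype=float)
    lam = np.zeros(Z.shape[1])

    def objective(l):
        # Convex dual objective: log-sum-exp of the tilted, centered descriptors.
        s = Z @ l
        m = s.max()
        return m + np.log(np.exp(s - m).sum())

    def weights(l):
        s = Z @ l
        w = np.exp(s - s.max())
        return w / w.sum()

    for _ in range(n_iter):
        w = weights(lam)
        grad = w @ Z  # weighted mean of centered descriptors
        if np.linalg.norm(grad) < tol:
            break
        # Hessian of the objective is the weighted covariance of Z.
        hess = (Z * w[:, None]).T @ Z - np.outer(grad, grad)
        step = np.linalg.solve(hess + 1e-10 * np.eye(lam.size), grad)
        # Damped Newton update: halve the step until the objective does not increase.
        t, f0 = 1.0, objective(lam)
        while objective(lam - t * step) > f0 and t > 1e-8:
            t /= 2.0
        lam = lam - t * step
    return weights(lam)


def weighted_risk(losses, weights):
    """Importance-weighted average of validation losses."""
    return float(np.dot(weights, np.asarray(losses, dtype=float)))
```

Given per-task validation losses from any resampling scheme, `weighted_risk(losses, calibration_weights(desc_val, target_mean))` would give a weighted risk estimate under these assumptions. Matching full descriptor distributions rather than means, for example by calibrating on binned descriptors, would need additional constraints that this sketch omits.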