Model Assessment and Selection under Temporal Distribution Shift
Elise Han, Chengpiao Huang, Kaizheng Wang
arXiv.org Artificial Intelligence
Statistical learning theory is traditionally founded on the assumption of a static data distribution: statistical models are trained and deployed in the same environment. In practice, however, this assumption is often violated, as the data distribution keeps changing over time. Such temporal distribution shift can cause a serious decline in model performance after deployment, underscoring the critical need to monitor models and detect potential degradation. Moreover, one often needs to choose among multiple candidate models produced by different learning algorithms (e.g., linear regression, random forests, neural networks) and hyperparameters (e.g., penalty parameter, step size, time window for training). Temporal distribution shift poses a major challenge to model selection, as past performance may not reliably predict future outcomes. Learners typically have to work with limited data from the current time period and abundant historical data, whose distributions may differ significantly.
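The selection problem sketched in the abstract, choosing among candidates trained on different time windows using scarce current-period data, can be illustrated with a toy example. This is not the paper's method, only an assumed baseline for intuition: fit least-squares models on training windows of several lengths over drifting data, then pick the one with the smallest error on a small current-period sample. The data-generating process and window lengths below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a drifting 1-D regression y = theta_t * x + noise,
# where the coefficient theta_t changes over time (temporal shift).
T = 200
x_hist = rng.normal(size=T)
theta = np.linspace(1.0, 3.0, T)  # slowly drifting slope
y_hist = theta * x_hist + 0.1 * rng.normal(size=T)

def fit_ols(x, y):
    """Least-squares slope through the origin."""
    return float(x @ y / (x @ x))

# Candidate models: OLS fits on training windows of different lengths.
# Longer windows use more data but mix in stale distributions.
windows = [200, 50, 10]
candidates = {w: fit_ols(x_hist[-w:], y_hist[-w:]) for w in windows}

# Limited data from the current period, where the true slope is 3.0.
x_now = rng.normal(size=15)
y_now = 3.0 * x_now + 0.1 * rng.normal(size=15)

def mse(slope):
    """Squared error of a candidate slope on the current sample."""
    return float(np.mean((y_now - slope * x_now) ** 2))

# Select the candidate with the smallest current-period error.
best_w = min(candidates, key=lambda w: mse(candidates[w]))
print(best_w, {w: round(mse(s), 4) for w, s in candidates.items()})
```

In this simulation the short training window wins: the full-history fit averages over stale slopes, while the recent window tracks the current distribution. With a genuinely static distribution the ranking would reverse, which is exactly why past performance alone is an unreliable selection criterion under shift.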
Feb-13-2024