Model Assessment and Selection under Temporal Distribution Shift
Elise Han, Chengpiao Huang, Kaizheng Wang
arXiv.org Artificial Intelligence
Statistical learning theory is traditionally founded on the assumption of a static data distribution: statistical models are trained and deployed in the same environment. In practice, however, this assumption is often violated, as the data distribution keeps changing over time. Such temporal distribution shift can cause a serious decline in model performance after deployment, underscoring the critical need to monitor models and detect potential degradation. Moreover, one often needs to choose among multiple candidate models produced by different learning algorithms (e.g., linear regression, random forests, neural networks) and hyperparameters (e.g., penalty parameter, step size, time window for training). Temporal distribution shift poses a major challenge to model selection, as past performance may not reliably predict future outcomes. Learners typically have to work with limited data from the current time period and abundant historical data, whose distributions may differ significantly.
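The selection problem sketched in the abstract, choosing among candidates trained on different time windows using scarce current-period data, can be illustrated with a toy example. This is not the paper's method, only an assumed baseline for intuition: fit least-squares models on training windows of several lengths over drifting data, then pick the one with the smallest error on a small current-period sample. The data-generating process and window lengths below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a drifting 1-D regression y = theta_t * x + noise,
# where the coefficient theta_t changes over time (temporal shift).
T = 200
x_hist = rng.normal(size=T)
theta = np.linspace(1.0, 3.0, T)  # slowly drifting slope
y_hist = theta * x_hist + 0.1 * rng.normal(size=T)

def fit_ols(x, y):
    """Least-squares slope through the origin."""
    return float(x @ y / (x @ x))

# Candidate models: OLS fits on training windows of different lengths.
# Longer windows use more data but mix in stale distributions.
windows = [200, 50, 10]
candidates = {w: fit_ols(x_hist[-w:], y_hist[-w:]) for w in windows}

# Limited data from the current period, where the true slope is 3.0.
x_now = rng.normal(size=15)
y_now = 3.0 * x_now + 0.1 * rng.normal(size=15)

def mse(slope):
    """Squared error of a candidate slope on the current sample."""
    return float(np.mean((y_now - slope * x_now) ** 2))

# Select the candidate with the smallest current-period error.
best_w = min(candidates, key=lambda w: mse(candidates[w]))
print(best_w, {w: round(mse(s), 4) for w, s in candidates.items()})
```

In this simulation the short training window wins: the full-history fit averages over stale slopes, while the recent window tracks the current distribution. With a genuinely static distribution the ranking would reverse, which is exactly why past performance alone is an unreliable selection criterion under shift.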
Feb-13-2024