Conformal inference is (almost) free for neural networks trained with early stopping

Liang, Ziyi, Zhou, Yanfei, Sesia, Matteo

Jun-26-2023 – arXiv.org Artificial Intelligence

Deep neural networks can detect complex data patterns and leverage them to make accurate predictions in many applications, including computer vision, natural language processing, and speech recognition. These models can sometimes even outperform skilled humans [1], but they still make mistakes. Unfortunately, the severity of these mistakes is compounded by the fact that the predictions computed by neural networks are often overconfident [2], partly due to overfitting [3, 4]. Several training strategies have been developed to mitigate overfitting, including dropout [5], batch normalization [6], weight normalization [7], data augmentation [8], and early stopping [9]; the last of these is the focus of this paper. Early stopping consists of evaluating, after each batch of stochastic gradient updates (or epoch), the predictive performance of the current model on independent hold-out data. After a large number of gradient updates, only the intermediate model that achieved the best performance on the hold-out data is used to make predictions. This strategy is often effective at mitigating overfitting and can produce relatively accurate predictions compared to fully trained models, but it does not fully resolve overconfidence because it does not lead to models with finite-sample guarantees.
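
The early stopping procedure described above can be illustrated with a minimal sketch. This is not the paper's method or code: it assumes a toy one-dimensional linear model trained by plain stochastic gradient descent on synthetic data, and the names `mse`, `train_with_early_stopping`, and `make_data` are hypothetical. It shows only the selection step: the hold-out loss is recorded after every epoch, and the snapshot with the lowest hold-out loss is returned instead of the final model.

```python
import random


def mse(w, b, data):
    """Mean squared error of the linear model y = w*x + b on (x, y) pairs."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)


def train_with_early_stopping(train, holdout, epochs=50, lr=0.05, seed=0):
    """Run SGD for a fixed number of epochs and return the snapshot
    with the lowest hold-out loss, plus the per-epoch loss history."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0  # toy model parameters (assumption: linear model)
    best = {"epoch": 0, "loss": mse(w, b, holdout), "params": (w, b)}
    history = [best["loss"]]  # history[k] is the hold-out loss after epoch k
    for epoch in range(1, epochs + 1):
        batch = train[:]
        rng.shuffle(batch)
        for x, y in batch:
            err = w * x + b - y
            w -= lr * 2 * err * x  # gradient of squared error w.r.t. w
            b -= lr * 2 * err      # gradient of squared error w.r.t. b
        loss = mse(w, b, holdout)
        history.append(loss)
        if loss < best["loss"]:
            # Keep the intermediate model that performs best on hold-out data.
            best = {"epoch": epoch, "loss": loss, "params": (w, b)}
    return best, history


def make_data(n, rng):
    """Synthetic data (assumption): y = 2x + 1 plus Gaussian noise."""
    out = []
    for _ in range(n):
        x = rng.uniform(-1, 1)
        out.append((x, 2.0 * x + 1.0 + rng.gauss(0, 0.3)))
    return out


rng = random.Random(1)
train_set = make_data(40, rng)
holdout_set = make_data(40, rng)
best, history = train_with_early_stopping(train_set, holdout_set)
print("best epoch:", best["epoch"], "hold-out loss:", round(best["loss"], 4))
```

Training here always runs for the full number of epochs and selects the best snapshot afterwards, matching the description above; a patience-based variant that halts once the hold-out loss stops improving would be a different design choice.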

artificial intelligence, machine learning, standard error

Country:
- North America > United States
  - Tennessee (0.04)
  - California > Los Angeles County
    - Los Angeles (0.28)
- Asia > Middle East
  - Jordan (0.04)

Genre:
- Research Report (1.00)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Statistical Learning > Gradient Descent (0.89)
  - Neural Networks > Deep Learning (0.87)
  - Performance Analysis > Accuracy (0.67)
