Conformal inference is (almost) free for neural networks trained with early stopping
Liang, Ziyi; Zhou, Yanfei; Sesia, Matteo
arXiv.org Artificial Intelligence
Deep neural networks can detect complex data patterns and leverage them to make accurate predictions in many applications, including computer vision, natural language processing, and speech recognition. These models can sometimes even outperform skilled humans [1], but they still make mistakes. Unfortunately, the severity of these mistakes is compounded by the fact that the predictions computed by neural networks are often overconfident [2], partly due to overfitting [3, 4]. Several training strategies have been developed to mitigate overfitting, including dropout [5], batch normalization [6], weight normalization [7], data augmentation [8], and early stopping [9]; the last of these is the focus of this paper. Early stopping consists of evaluating the predictive performance of the current model on independent hold-out data after each batch of stochastic gradient updates (or after each epoch). After a large number of gradient updates, only the intermediate model that achieved the best performance on the hold-out data is used to make predictions. This strategy is often effective at mitigating overfitting and can produce relatively accurate predictions compared to fully trained models, but it does not fully resolve overconfidence because it does not lead to models with finite-sample guarantees.
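The early stopping procedure described above can be sketched in a few lines. This is only a minimal illustration under stated assumptions, not the authors' method: it swaps the neural network for a linear model trained by mini-batch stochastic gradient descent on synthetic data, and the function name `train_with_early_stopping`, its parameters, and the data split are all hypothetical. It shows the generic hold-out selection step only and does not provide the finite-sample guarantees that the paper is concerned with.

```python
import numpy as np


def train_with_early_stopping(X_train, y_train, X_hold, y_hold,
                              n_epochs=200, lr=0.01, batch_size=16, seed=0):
    """Mini-batch SGD on squared error; keep the weights with the lowest hold-out loss.

    Hypothetical helper for illustration: a linear model stands in for a neural network.
    """
    rng = np.random.default_rng(seed)
    w = np.zeros(X_train.shape[1])
    best_w, best_loss, best_epoch = w.copy(), np.inf, -1
    hold_losses = []

    for epoch in range(n_epochs):
        # One epoch: a pass over the training data in shuffled mini-batches.
        order = rng.permutation(len(X_train))
        for start in range(0, len(order), batch_size):
            idx = order[start:start + batch_size]
            residual = X_train[idx] @ w - y_train[idx]
            grad = 2.0 * X_train[idx].T @ residual / len(idx)
            w = w - lr * grad

        # Evaluate the current (intermediate) model on the hold-out data.
        hold_loss = float(np.mean((X_hold @ w - y_hold) ** 2))
        hold_losses.append(hold_loss)

        # Remember the intermediate model with the best hold-out performance so far.
        if hold_loss < best_loss:
            best_w, best_loss, best_epoch = w.copy(), hold_loss, epoch

    return best_w, best_epoch, hold_losses


# Synthetic data (an assumption for the demo): few training points, many features,
# so the fully trained model can overfit.
data_rng = np.random.default_rng(1)
X = data_rng.normal(size=(120, 20))
true_w = np.zeros(20)
true_w[:3] = [1.0, -2.0, 0.5]
y = X @ true_w + data_rng.normal(scale=1.0, size=120)

X_train, y_train = X[:40], y[:40]
X_hold, y_hold = X[40:], y[40:]

best_w, best_epoch, hold_losses = train_with_early_stopping(
    X_train, y_train, X_hold, y_hold
)
print("selected epoch:", best_epoch)
print("hold-out loss at selected epoch:", hold_losses[best_epoch])
print("hold-out loss after the final epoch:", hold_losses[-1])
```

Note that the same hold-out data that selects the epoch has also been used to choose the model, so its loss is no longer an unbiased estimate of performance on new data. This is one way to see why early stopping alone does not yield finite-sample guarantees.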
Jun-26-2023