Conformal inference is (almost) free for neural networks trained with early stopping

Liang, Ziyi, Zhou, Yanfei, Sesia, Matteo

arXiv.org Artificial Intelligence 

Deep neural networks can detect complex data patterns and leverage them to make accurate predictions in many applications, including computer vision, natural language processing, and speech recognition, to name a few examples. These models can sometimes even outperform skilled humans [1], but they still make mistakes. Unfortunately, the severity of these mistakes is compounded by the fact that the predictions computed by neural networks are often overconfident [2], partly due to overfitting [3, 4]. Several training strategies have been developed to mitigate overfitting, including dropout [5], batch normalization [6], weight normalization [7], data augmentation [8], and early stopping [9]; the latter is the focus of this paper. Early stopping consists of continuously evaluating after each batch of stochastic gradient updates (or epoch) the predictive performance of the current model on hold-out independent data. After a large number of gradient updates, only the intermediate model achieving the best performance on the hold-out data is utilized to make predictions. This strategy is often effective at mitigating overfitting and can produce relatively accurate predictions compared to fully trained models, but it does not fully resolve overconfidence because it does not lead to models with finite-sample guarantees.

Duplicate Docs Excel Report

Title
None found

Similar Docs  Excel Report  more

TitleSimilaritySource
None found