AITopics | adafocal

Overconfidence in deep neural networks could easily lead to deployments where predictions are made that should have been withheld. Figure 7: ResNet-50 trained on CIFAR-10 using focal loss $\gamma = 0, 3, 4, 5$. Similarly, the confidence of the top predicted class $\hat{y}$ (for the training sample) is denoted by $\hat{p}_{\text{train,top}}$ and the average equivalent in a bin by $C_{\text{train,top}}$. For the training set, we care only about the confidence of the "true class" $\hat{p}_{\text{train,true}}$, as that is the quantity which gets manipulated by some loss function. For the validation set, on the other hand, we care about the confidence of the "top predicted class".
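The per-bin averages mentioned in this excerpt can be illustrated with a minimal sketch. It assumes equal-width confidence bins and uses hypothetical names (`bin_calibration_stats`, `expected_calibration_error`) that do not come from the paper; the expected calibration error at the end is a standard summary statistic added here for illustration, not something the excerpt names.

```python
def bin_calibration_stats(confidences, correct, num_bins=10):
    """Return one (count, mean confidence, accuracy) tuple per equal-width bin.

    confidences: top-class probabilities in [0, 1]
    correct:     booleans, True where the top prediction was right
    """
    sums = [[0, 0.0, 0.0] for _ in range(num_bins)]
    for conf, ok in zip(confidences, correct):
        # A confidence of exactly 1.0 is placed in the last bin.
        idx = min(int(conf * num_bins), num_bins - 1)
        sums[idx][0] += 1
        sums[idx][1] += conf
        sums[idx][2] += 1.0 if ok else 0.0

    stats = []
    for count, conf_sum, acc_sum in sums:
        if count == 0:
            stats.append((0, None, None))  # empty bin: no averages defined
        else:
            stats.append((count, conf_sum / count, acc_sum / count))
    return stats


def expected_calibration_error(stats):
    """Count-weighted mean of |mean confidence - accuracy| over non-empty bins."""
    total = sum(count for count, _, _ in stats)
    return sum(
        count / total * abs(conf - acc)
        for count, conf, acc in stats
        if count > 0
    )


# Toy validation data: two overconfident and two underconfident predictions.
stats = bin_calibration_stats([0.95, 0.92, 0.55, 0.58], [True, False, True, True])
ece = expected_calibration_error(stats)
```

A bin whose mean confidence exceeds its accuracy is overconfident; the reverse indicates underconfidence.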


Neural Information Processing Systems


AdaFocal: Calibration-aware Adaptive Focal Loss

Neural Information Processing Systems Feb-7-2026, 09:44:00 GMT

This success stems from focal loss regularizing the entropy of the model's prediction (controlled by the parameter $\gamma$), thereby reining in the model's overconfidence.
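To make the role of $\gamma$ concrete, here is a minimal sketch of the standard per-sample focal loss, $-(1 - p)^{\gamma} \log p$, where $p$ is the predicted probability of the true class. The function name `focal_loss` is a placeholder chosen for this example; the snippet only illustrates how larger $\gamma$ shrinks the loss on already-confident samples and is not the paper's training code.

```python
import math


def focal_loss(p_true, gamma):
    """Focal loss for one sample given the true-class probability p_true.

    With gamma = 0 this reduces to ordinary cross-entropy, -log(p_true).
    """
    if not 0.0 < p_true <= 1.0:
        raise ValueError("p_true must lie in (0, 1]")
    return -((1.0 - p_true) ** gamma) * math.log(p_true)


# Compare a confident sample (p = 0.9) with an uncertain one (p = 0.3)
# for the gamma values listed in the figure caption above.
for gamma in (0, 3, 4, 5):
    print(gamma, focal_loss(0.9, gamma), focal_loss(0.3, gamma))
```

Because the factor $(1 - p)^{\gamma}$ is much smaller for the confident sample, raising $\gamma$ reduces the incentive to push its probability even closer to 1.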


AdaFocal: Calibration-aware Adaptive Focal Loss

Neural Information Processing Systems Dec-23-2025, 18:08:18 GMT

Much recent work has been devoted to the problem of ensuring that a neural network's confidence scores match the true probability of being correct, i.e., the calibration problem. Of note, it was found that training with focal loss leads to better calibration than cross-entropy while achieving a similar level of accuracy \cite{mukhoti2020}. This success stems from focal loss regularizing the entropy of the model's prediction (controlled by the parameter $\gamma$), thereby reining in the model's overconfidence. Further improvement is expected if $\gamma$ is selected independently for each training sample, as in Sample-Dependent Focal Loss (FLSD-53) \cite{mukhoti2020}. However, FLSD-53 is based on heuristics and does not generalize well. In this paper, we propose a calibration-aware adaptive focal loss called AdaFocal that utilizes the calibration properties of focal (and inverse-focal) loss and adaptively modifies $\gamma_t$ for different groups of samples based on $\gamma_{t-1}$ from the previous step and knowledge of the model's under/over-confidence on the validation set. We evaluate AdaFocal on various image recognition tasks and one NLP task, covering a wide variety of network architectures, to confirm the improvement in calibration while achieving similar levels of accuracy. Additionally, we show that models trained with AdaFocal achieve a significant boost in out-of-distribution detection.
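The abstract says only that $\gamma_t$ depends on $\gamma_{t-1}$ and on the model's under/over-confidence on the validation set; it does not state the update rule. The sketch below is therefore an illustration under stated assumptions, not the authors' method: the multiplicative exponential form, the step size `lam`, the clipping bounds, and the name `update_gamma` are all hypothetical choices made for this example.

```python
import math


def update_gamma(gamma_prev, val_confidence, val_accuracy,
                 lam=1.0, gamma_min=0.01, gamma_max=20.0):
    """Illustrative gamma update for one group (bin) of samples.

    ASSUMPTION: the exponential rule, lam, and the clipping bounds are
    placeholders for illustration; the abstract does not specify them.

    gap > 0 (overconfident group)  -> gamma grows, stronger regularization
    gap < 0 (underconfident group) -> gamma shrinks, weaker regularization
    """
    gap = val_confidence - val_accuracy
    gamma = gamma_prev * math.exp(lam * gap)
    # Keep gamma inside a fixed range so it can neither vanish nor explode.
    return min(max(gamma, gamma_min), gamma_max)


# One update step for three hypothetical validation bins, each starting at gamma = 1.
bins = {"overconfident": (0.9, 0.7), "calibrated": (0.8, 0.8), "underconfident": (0.6, 0.8)}
new_gammas = {name: update_gamma(1.0, conf, acc) for name, (conf, acc) in bins.items()}
```

In this sketch a well-calibrated group leaves $\gamma$ unchanged. How an underconfident group would switch to inverse-focal loss, which the abstract mentions, is not modeled here.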


AdaFocal: Calibration-aware Adaptive Focal Loss

Neural Information Processing Systems Oct-9-2024, 14:25:05 GMT

Much recent work has been devoted to the problem of ensuring that a neural network's confidence scores match the true probability of being correct, i.e., the calibration problem. Of note, it was found that training with focal loss leads to better calibration than cross-entropy while achieving a similar level of accuracy \cite{mukhoti2020}. This success stems from focal loss regularizing the entropy of the model's prediction (controlled by the parameter $\gamma$), thereby reining in the model's overconfidence. Further improvement is expected if $\gamma$ is selected independently for each training sample, as in Sample-Dependent Focal Loss (FLSD-53) \cite{mukhoti2020}. However, FLSD-53 is based on heuristics and does not generalize well. In this paper, we propose a calibration-aware adaptive focal loss called AdaFocal that utilizes the calibration properties of focal (and inverse-focal) loss and adaptively modifies $\gamma_t$ for different groups of samples based on $\gamma_{t-1}$ from the previous step and knowledge of the model's under/over-confidence on the validation set.


AdaFocal: Calibration-aware Adaptive Focal Loss

Ghosh, Arindam, Schaaf, Thomas, Gormley, Matthew R.

arXiv.org Artificial Intelligence Jun-16-2023

Much recent work has been devoted to the problem of ensuring that a neural network's confidence scores match the true probability of being correct, i.e., the calibration problem. Of note, it was found that training with focal loss leads to better calibration than cross-entropy while achieving a similar level of accuracy \cite{mukhoti2020}. This success stems from focal loss regularizing the entropy of the model's prediction (controlled by the parameter $\gamma$), thereby reining in the model's overconfidence. Further improvement is expected if $\gamma$ is selected independently for each training sample, as in Sample-Dependent Focal Loss (FLSD-53) \cite{mukhoti2020}. However, FLSD-53 is based on heuristics and does not generalize well. In this paper, we propose a calibration-aware adaptive focal loss called AdaFocal that utilizes the calibration properties of focal (and inverse-focal) loss and adaptively modifies $\gamma_t$ for different groups of samples based on $\gamma_{t-1}$ from the previous step and knowledge of the model's under/over-confidence on the validation set. We evaluate AdaFocal on various image recognition tasks and one NLP task, covering a wide variety of network architectures, to confirm the improvement in calibration while achieving similar levels of accuracy. Additionally, we show that models trained with AdaFocal achieve a significant boost in out-of-distribution detection.


2211.11838
