An Entropic Metric for Measuring Calibration of Machine Learning Models
Sumler, Daniel James, Devlin, Lee, Maskell, Simon, Lane, Richard O.
arXiv.org Artificial Intelligence
Understanding the confidence with which a machine learning model classifies an input datum is an important, and perhaps under-investigated, concept. In this paper, we propose a new calibration metric, the Entropic Calibration Difference (ECD). Based on existing research in the field of state estimation, specifically target tracking (TT), we show how ECD may be applied to binary classification machine learning models. We describe the relative importance of under- and over-confidence and how they are not conflated in the TT literature. We consider this important given that algorithms that are under-confident are likely to be "safer" than algorithms that are over-confident, albeit at the expense of also being over-cautious and so statistically inefficient. We demonstrate how this new metric performs on real and simulated data and compare it with other metrics for machine learning model probability calibration, including the Expected Calibration Error (ECE) and its signed counterpart, the Expected Signed Calibration Error (ESCE).

Calibration of probabilities is an important and often-overlooked concept when developing machine learning (ML) models. Usually, accuracy is the main metric used to assess how well an ML model predicts the class of unseen data. Generally speaking, the closer the accuracy is to 100%, the better the model is deemed to be. However, accuracy does not take into account the probabilities that the model attaches to its predictions, which can be just as important as the accuracy itself, if not more so.
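To make the distinction between accuracy and calibration concrete, the following is a minimal Python sketch of the Expected Calibration Error for a binary classifier, using the common equal-width binning formulation. It is not the paper's ECD, and the signed value returned alongside it is only an illustrative stand-in for a signed calibration error: the function name `calibration_errors`, the 0.5 decision threshold, the number of bins, and the sign convention (positive meaning over-confident) are all assumptions of this sketch rather than definitions taken from the paper.

```python
import numpy as np

def calibration_errors(probs, labels, n_bins=10):
    """Return (ECE, signed error) from predicted P(class = 1) and true 0/1 labels.

    Assumed conventions (not from the paper): 0.5 decision threshold,
    equal-width confidence bins, positive signed error = over-confidence.
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=int)

    preds = (probs >= 0.5).astype(int)
    # Confidence is the probability assigned to the predicted class.
    conf = np.where(preds == 1, probs, 1.0 - probs)
    correct = (preds == labels).astype(float)

    # Assign each sample to an equal-width confidence bin.
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    idx = np.clip(np.digitize(conf, edges[1:-1], right=True), 0, n_bins - 1)

    ece, signed = 0.0, 0.0
    for b in range(n_bins):
        mask = idx == b
        if not mask.any():
            continue
        # Gap between mean confidence and empirical accuracy in this bin.
        gap = conf[mask].mean() - correct[mask].mean()
        weight = mask.sum() / len(probs)
        ece += weight * abs(gap)   # unsigned: direction is lost
        signed += weight * gap     # signed: over- vs under-confidence kept
    return ece, signed

# A model that always says 0.9 but is right only 60% of the time.
over = calibration_errors([0.9] * 10, [1] * 6 + [0] * 4)
# A model that always says 0.6 but is always right.
under = calibration_errors([0.6] * 10, [1] * 10)
print(over, under)
```

In this sketch the two models have unsigned errors of similar size, but the signed value is positive for the first and negative for the second, which is the kind of distinction between over- and under-confidence that the abstract argues should not be conflated.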
Feb-20-2025
- Country:
  - Europe > United Kingdom > England > Merseyside > Liverpool (0.04)
  - North America > United States > California > San Diego County > San Diego (0.04)
  - North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
  - South America > Brazil > Rio Grande do Sul (0.04)
- Genre:
  - Research Report (0.64)
- Industry:
  - Health & Medicine (0.68)
- Technology: