Calibration through the Lens of Interpretability

Alireza Torabian, Ruth Urner

arXiv.org Artificial Intelligence 

In many applications it is important that a classification model not only has high accuracy, but also provides a user with a reliable estimate of confidence in the predicted label. Calibration is a concept that is often invoked to provide such confidence estimates to a user. As such, calibration is a notion that is inherently aimed at human interpretation. In binary classification, a perfectly calibrated model f provides the guarantee that if it predicts f(x) = p on some instance x, then among the set of all instances on which f assigns this value p, the probability of label 1 is indeed p (and the probability of label 0 is thus 1 - p). While calibration is generally considered useful, we would argue that in many cases, even if achieved, it is doomed to fail at its original goal of providing insight to a human user: for most suitably complex classification models, a human user who observes f(x) = p has no notion of the set of all instances on which f also outputs p.
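As an illustration of the calibration condition described above, the following minimal Python sketch (not from the paper; the binning scheme and synthetic data are assumptions for illustration) groups instances by the model's predicted probability p and compares p to the observed frequency of label 1 within each group. Each bin is a finite stand-in for "the set of all instances on which f outputs p."

```python
import numpy as np

def binned_calibration(pred_probs: np.ndarray, labels: np.ndarray, n_bins: int = 10):
    """For each bin of predicted probabilities, return (mean predicted p,
    observed frequency of label 1, bin size). Under perfect calibration,
    the first two values agree on every well-populated bin."""
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    rows = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (pred_probs >= lo) & (pred_probs < hi)
        if mask.any():
            rows.append((pred_probs[mask].mean(), labels[mask].mean(), int(mask.sum())))
    return rows

# Illustrative synthetic data, calibrated by construction: each label is
# drawn to be 1 with probability equal to the predicted score.
rng = np.random.default_rng(0)
probs = rng.uniform(size=10_000)
labels = (rng.uniform(size=10_000) < probs).astype(int)
for mean_p, freq_1, n in binned_calibration(probs, labels):
    print(f"predicted ~{mean_p:.2f}  observed {freq_1:.2f}  (n={n})")
```

Note that the bins here are defined over the prediction values themselves, which makes the authors' point concrete: a user shown only f(x) = p sees one number, not the (possibly very heterogeneous) collection of instances that the guarantee averages over.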