TCE: A Test-Based Approach to Measuring Calibration Error
Takuo Matsubara, Niek Tax, Richard Mudd, Ido Guy
This paper proposes a new metric to measure the calibration error of probabilistic binary classifiers, called test-based calibration error (TCE). TCE incorporates a novel loss function based on a statistical test to examine the extent to which model predictions differ from probabilities estimated from data. It offers (i) a clear interpretation, (ii) a consistent scale that is unaffected by class imbalance, and (iii) an enhanced visual representation with respect to the standard reliability diagram. In addition, we introduce an optimality criterion for the binning procedure of calibration error metrics based on a …

While a number of metrics, such as log-likelihood, user-specified scoring functions, and the area under the receiver operating characteristic (ROC) curve, are used to assess the quality of probabilistic classifiers, it is usually hard or even impossible to gauge whether predictions are well-calibrated from the values of these metrics. For assessment of calibration, it is typically necessary to use a metric that measures calibration error, that is, a deviation between model predictions and probabilities of target occurrences estimated from data. The importance of assessing calibration error has been long emphasised in machine learning [Nixon et al., 2019, Minderer et al., 2021] and in probabilistic forecasting more broadly [Dawid, 1982, DeGroot and Fienberg, 1983].
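The abstract describes TCE as applying a statistical test to check whether outcomes observed within each bin are consistent with the model's predicted probabilities. The sketch below illustrates that idea under stated assumptions: equal-width bins, a two-sided exact binomial test per bin, and aggregation by the fraction of predictions falling in rejected bins. The function name, the significance level alpha, and the size-weighted aggregation are illustrative choices, not the paper's exact definitions.

```python
# A minimal sketch of a test-based calibration error, NOT the paper's
# reference implementation: equal-width bins, an exact binomial test per
# bin, and a rejection-rate aggregation are all assumptions.
import numpy as np
from scipy.stats import binomtest

def test_based_calibration_error(probs, labels, n_bins=10, alpha=0.05):
    """Fraction of predictions falling in bins where an exact binomial test
    rejects the hypothesis that outcomes match the bin's mean prediction."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=int)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Assign each prediction to a bin index in 0..n_bins-1.
    bin_ids = np.clip(np.digitize(probs, edges[1:-1]), 0, n_bins - 1)
    rejected = 0
    for b in range(n_bins):
        mask = bin_ids == b
        n = int(mask.sum())
        if n == 0:
            continue
        k = int(labels[mask].sum())    # observed positives in the bin
        p = float(probs[mask].mean())  # bin's mean predicted probability
        if binomtest(k, n, p).pvalue < alpha:
            rejected += n              # weight rejected bins by their size
    return rejected / len(probs)

# Usage: labels drawn from the predicted probabilities are calibrated by
# construction, so the returned fraction should be small.
rng = np.random.default_rng(0)
p = rng.uniform(size=10_000)
y = rng.binomial(1, p)
print(test_based_calibration_error(p, y))
```

Because each bin is tested against its own mean predicted probability rather than against a global base rate, a per-bin test of this kind is one way to obtain a score whose scale is not driven by class imbalance, which is consistent with property (ii) claimed in the abstract.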
arXiv.org Artificial Intelligence
Jun-25-2023
- Country:
- Europe (0.28)
- North America > United States
- California (0.14)
- Genre:
- Research Report (1.00)
- Industry:
- Health & Medicine (1.00)