Calibrating conditional risk
Vasilyev, Andrey, Wang, Yikai, Li, Xiaocheng, Chen, Guanting
We introduce and study the problem of calibrating conditional risk, which involves estimating the expected loss of a prediction model conditional on input features. We analyze this problem in both classification and regression settings and show that it is fundamentally equivalent to a standard regression task. For classification settings, we further establish a connection between conditional risk calibration and individual/conditional probability calibration, and develop theoretical insights for the performance metric. This reveals that while conditional risk calibration is related to existing uncertainty quantification problems, it remains a distinct and standalone machine learning problem. Empirically, we validate our theoretical findings and demonstrate the practical implications of conditional risk calibration in the learning to defer (L2D) framework. Our systematic experiments provide both qualitative and quantitative assessments, offering guidance for future research in uncertainty-aware decision-making.
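The claimed reduction to regression can be sketched directly: fix the prediction model, compute its per-example losses, and regress those losses on the input features. The toy data, the crude classifier, and the linear least-squares probe below are all illustrative assumptions, not details from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: binary labels depend on a single feature.
X = rng.normal(size=(500, 1))
p = 1 / (1 + np.exp(-2 * X[:, 0]))
y = rng.binomial(1, p)

# A fixed (deliberately crude) classifier whose risk we want to calibrate.
def predict_proba(X):
    return np.clip(0.5 + 0.3 * X[:, 0], 0.01, 0.99)

# Per-example loss of the fixed model (here: squared error / Brier).
losses = (predict_proba(X) - y) ** 2

# Conditional risk calibration as an ordinary regression of losses on
# features; a linear-in-features least-squares fit stands in for any regressor.
A = np.column_stack([np.ones(len(X)), X])
coef, *_ = np.linalg.lstsq(A, losses, rcond=None)

def conditional_risk(X):
    """Estimate E[loss | x] for new inputs."""
    return np.column_stack([np.ones(len(X)), X]) @ coef

est = conditional_risk(X)
# With an intercept, least squares matches the average loss exactly.
print(abs(est.mean() - losses.mean()) < 1e-8)
```

Any off-the-shelf regressor can replace the linear probe; the point is only that the target variable is the realized loss of the fixed model.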
Decomposing Probabilistic Scores: Reliability, Information Loss and Uncertainty
Charpentier, Arthur, Machado, Agathe Fernandes
Calibration is a conditional property that depends on the information retained by a predictor. We develop decomposition identities for arbitrary proper losses that make this dependence explicit. At any information level $\mathcal A$, the expected loss of an $\mathcal A$-measurable predictor splits into a proper-regret (reliability) term and a conditional entropy (residual uncertainty) term. For nested levels $\mathcal A\subseteq\mathcal B$, a chain decomposition quantifies the information gain from $\mathcal A$ to $\mathcal B$. Applied to classification with features $\boldsymbol{X}$ and score $S=s(\boldsymbol{X})$, this yields a three-term identity: miscalibration, a {\em grouping} term measuring information loss from $\boldsymbol{X}$ to $S$, and irreducible uncertainty at the feature level. We leverage the framework to analyze post-hoc recalibration, aggregation of calibrated models, and stagewise/boosting constructions, with explicit forms for Brier and log-loss.
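For the Brier score, the three-term identity takes the familiar explicit form of the standard calibration/grouping/irreducible decomposition (written here in the abstract's notation; the exact statement in the paper may differ):

```latex
\mathbb{E}\big[(S-Y)^2\big]
 = \underbrace{\mathbb{E}\big[(S-\mathbb{E}[Y\mid S])^2\big]}_{\text{miscalibration}}
 + \underbrace{\mathbb{E}\big[(\mathbb{E}[Y\mid S]-\mathbb{E}[Y\mid \boldsymbol{X}])^2\big]}_{\text{grouping}}
 + \underbrace{\mathbb{E}\big[\operatorname{Var}(Y\mid \boldsymbol{X})\big]}_{\text{irreducible}}
```

The cross terms vanish because $S$ is $\boldsymbol{X}$-measurable and each centered term has zero conditional mean at the relevant level of conditioning.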
Diminishing Returns Shape Constraints for Interpretability and Regularization
Maya Gupta, Dara Bahri, Andrew Cotter, Kevin Canini
Similarly, a model that predicts the time it will take a customer to grocery shop should decrease in the number of cashiers, but each added cashier reduces average wait time by less. In both cases, we would like to be able to incorporate this prior knowledge by constraining the machine learned model's output to have a diminishing returns response to the size of the apartment or number of cashiers.
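A one-dimensional version of such a shape constraint can be built into the parameterization: a piecewise-linear model whose segment slopes are forced to be nonnegative and nonincreasing is monotone with diminishing returns by construction. The knots, the cumulative-minimum reparameterization, and the sample parameters below are illustrative assumptions, not the paper's method (which uses constrained lattice models).

```python
import numpy as np

# Fixed knots for a piecewise-linear (PWL) model on [0, 4].
knots = np.array([0.0, 1.0, 2.0, 3.0, 4.0])

def pwl(x, slopes, bias=0.0):
    """Evaluate the PWL function; slopes[i] applies on [knots[i], knots[i+1]]."""
    x = np.asarray(x, dtype=float)
    # How much of each segment lies to the left of x.
    seg_len = np.clip(x[:, None] - knots[:-1], 0.0, np.diff(knots))
    return bias + seg_len @ slopes

def project_diminishing(raw):
    """Map unconstrained parameters to slopes that are >= 0 and nonincreasing.

    Taking the cumulative minimum of absolute values guarantees the shape
    constraint for any raw input (one simple choice among several).
    """
    return np.minimum.accumulate(np.abs(raw))

slopes = project_diminishing(np.array([3.0, -2.5, 1.7, 0.4]))
vals = pwl(np.linspace(0.0, 4.0, 50), slopes)

# Monotone increasing, and increments shrink as the input grows.
increments = np.diff(vals)
print(np.all(increments >= -1e-12), np.all(np.diff(increments) <= 1e-12))
```

Training then optimizes the unconstrained `raw` parameters by any gradient method; the projection keeps every candidate model inside the constrained family, so the constraint never has to be checked after the fact.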
The Unreliability of Explanations in Few-shot Prompting for Textual Reasoning
However, text-davinci-002 is able to benefit more substantially. We further show that explanations generated by the LLMs may not entail the models' predictions nor be factually grounded in the input, even on simple tasks with extractive explanations. However, these flawed explanations can still be useful as a way to verify LLMs' predictions post-hoc.
Calibration of Shared Equilibria in General Sum Partially Observable Markov Games
We consider a general sum partially observable Markov game where agents of different types share a single policy network, conditioned on agent-specific information. This paper aims at i) formally understanding equilibria reached by such agents, and ii) matching emergent phenomena of such equilibria to real-world targets. Parameter sharing with decentralized execution has been introduced as an efficient way to train multiple agents using a single policy network.