Optimal and Provable Calibration in High-Dimensional Binary Classification: Angular Calibration and Platt Scaling

Yufan Li, Pragya Sur

arXiv.org Machine Learning 

Department of Statistics, Harvard University, 1 Oxford Street, Cambridge, MA

Abstract

We study the fundamental problem of calibrating a linear binary classifier of the form σ(ŵ⊤x), where the feature vector x is Gaussian, σ is a link function, and ŵ is an estimator of the true linear weight w⋆. By interpolating with a noninformative chance classifier, we construct a well-calibrated predictor whose interpolation weight depends on the angle ∠(ŵ, w⋆) between the estimator ŵ and the true linear weight w⋆. We establish that this angular calibration approach is provably well-calibrated in a high-dimensional regime where the number of samples and features diverge at a comparable rate, and that the angle ∠(ŵ, w⋆) can be consistently estimated. Furthermore, the resulting predictor is uniquely Bregman-optimal: it minimizes the Bregman divergence to the true label distribution within a suitable class of calibrated predictors. Our work is the first to provide a calibration strategy that provably satisfies both calibration and optimality in high dimensions. Additionally, we identify conditions under which a classical Platt-scaling predictor converges to our Bregman-optimal calibrated solution; thus, Platt scaling also provably inherits these desirable properties in high dimensions.

Keywords: Calibration; Binary Classification; High Dimensions; Bregman Divergence

1. Introduction

Calibration of predictive models is a fundamental problem in statistics and machine learning, especially in applications that require reliable uncertainty quantification.
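To make the interpolation idea concrete, the following is a minimal illustrative sketch in Python. It blends the plug-in prediction σ(ŵ⊤x) with the noninformative chance prediction 1/2, using a weight tied to the cosine of the angle between ŵ and w⋆. The specific interpolation rule and weight choice here are hypothetical stand-ins for exposition, not the paper's exact construction.

```python
import numpy as np

def sigmoid(t):
    """Logistic link sigma(t) = 1 / (1 + exp(-t))."""
    return 1.0 / (1.0 + np.exp(-t))

def angular_calibrated_predict(x, w_hat, cos_angle):
    """Illustrative angular-calibration-style predictor.

    Interpolates between the plug-in prediction sigmoid(w_hat @ x)
    and the chance prediction 0.5. The interpolation weight is a
    function of cos_angle, the (estimated) cosine of the angle
    between w_hat and the true weight w*.

    NOTE: this weight choice is a hypothetical stand-in; the paper
    derives the provably calibrated construction.
    """
    plug_in = sigmoid(x @ w_hat)
    lam = max(0.0, float(cos_angle))  # hypothetical interpolation weight
    return lam * plug_in + (1.0 - lam) * 0.5
```

When the estimator is uninformative (cos_angle = 0), the sketch falls back to the chance prediction 1/2; when ŵ aligns perfectly with w⋆ (cos_angle = 1), it returns the plug-in prediction unchanged.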
