Self-calibrating Probability Forecasting
Vladimir Vovk, Glenn Shafer, Ilia Nouretdinov
Neural Information Processing Systems
In the problem of probability forecasting the learner's goal is to output, given a training set and a new object, a suitable probability measure on the possible values of the new object's label. An online algorithm for probability forecasting is said to be well-calibrated if the probabilities it outputs agree with the observed frequencies. We give a natural nonasymptotic formalization of the notion of well-calibratedness, which we then study under the assumption of randomness (the object/label pairs are independent and identically distributed). It turns out that, although no probability forecasting algorithm is automatically well-calibrated in our sense, there exists a wide class of algorithms for "multiprobability forecasting" (such algorithms are allowed to output a set, ideally very narrow, of probability measures) which satisfy this property; we call the algorithms in this class "Venn probability machines". Our experimental results demonstrate that a 1-Nearest Neighbor Venn probability machine performs reasonably well on a standard benchmark data set, and one of our theoretical results asserts that a simple Venn probability machine asymptotically approaches the true conditional probabilities regardless, and without knowledge, of the true probability measure generating the examples.
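The multiprobability idea can be illustrated with a minimal sketch of a 1-Nearest Neighbor Venn predictor. The abstract does not spell out the construction, so the following assumes the usual Venn scheme: for each hypothetical label of the new object, the extended data set is partitioned into categories (here, by the label of each example's nearest neighbor, an assumed taxonomy), and the empirical label frequencies in the new example's category form one probability measure. The function names `nearest_label` and `venn_1nn` and the toy data are hypothetical, chosen only for illustration.

```python
from collections import Counter


def nearest_label(i, xs, ys):
    """Label of the nearest neighbour of example i among all other examples
    (squared Euclidean distance; ties go to the lowest index)."""
    best_j = min(
        (j for j in range(len(xs)) if j != i),
        key=lambda j: sum((a - b) ** 2 for a, b in zip(xs[i], xs[j])),
    )
    return ys[best_j]


def venn_1nn(train_x, train_y, new_x, labels):
    """Return one empirical label distribution per hypothetical label
    of the new object: {hypothetical label: {label: frequency}}."""
    multiprob = {}
    for y in labels:
        # Extend the training set with the new object and a tentative label y.
        xs = list(train_x) + [new_x]
        ys = list(train_y) + [y]
        n = len(xs)
        # Assumed taxonomy: an example's category is its nearest neighbour's label.
        cats = [nearest_label(i, xs, ys) for i in range(n)]
        # Labels of all examples sharing the new example's category
        # (the new example itself is always included).
        same = [ys[i] for i in range(n) if cats[i] == cats[-1]]
        counts = Counter(same)
        multiprob[y] = {lab: counts[lab] / len(same) for lab in labels}
    return multiprob


# Toy one-dimensional data, invented for illustration only.
train_x = [(0.0,), (0.2,), (1.0,), (1.2,)]
train_y = [0, 0, 1, 1]
result = venn_1nn(train_x, train_y, (0.3,), labels=[0, 1])
for hypothetical_label, distribution in result.items():
    print(hypothetical_label, distribution)
```

The output is a set of probability measures, one per hypothetical label. With so few examples the measures disagree noticeably; the "ideally very narrow" set mentioned above would be expected only as the categories fill with more data.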
Dec-31-2004
- Country:
- North America > United States
- New York (0.05)
- New Jersey > Essex County
- Newark (0.04)
- California
- San Francisco County > San Francisco (0.14)
- Santa Cruz County > Santa Cruz (0.04)
- San Mateo County > Menlo Park (0.04)
- Europe
- United Kingdom (0.04)
- Netherlands > South Holland
- Dordrecht (0.04)
- Genre:
- Research Report (0.35)