to

AAAI96-194.pdf

Cox's well-known theorem justifying the use of probability is shown not to hold in finite domains. The counterexample also suggests that Cox's assumptions are insufficient to prove the result even in infinite domains. The same counterexample is used to disprove a result of Fine on comparative conditional probability.

FLAIRS01-115.pdf

Halpern has retracted an earlier claim that Cox's Theorem is deductively unsound, but he has renewed and amplified his objections to the reasonableness of the theorem for finite domains. His new argument highlights one functional equation used by Cox in 1946 but which is missing in 1978. The circumstances of its disappearance are explored, along with some of the advances in knowledge since 1946 which account for its absence. Introduction Cox's Theorem (1946, 1961, 1978) is a well-known foundational result for subjective probabilities. The theorem concerns the existence of real-valued functions which, when applied to suitable measures, yield ordinary probabilities which obey the usual additivity and product rules. Cox proves the theorem without reliance on any frequentist notions. Since the measures and functions in question exist under what many take to be mild assumptions, the theorem is interpreted as a normative motivation of belief models which feature probability.

PAC learnability under non-atomic measures: a problem by Vidyasagar

In response to a 1997 problem of M. Vidyasagar, we state a criterion for PAC learnability of a concept class $\mathscr C$ under the family of all non-atomic (diffuse) measures on the domain $\Omega$. The uniform Glivenko--Cantelli property with respect to non-atomic measures is no longer a necessary condition, and consistent learnability cannot in general be expected. Our criterion is stated in terms of a combinatorial parameter $\VC({\mathscr C}\,{\mathrm{mod}}\,\omega_1)$ which we call the VC dimension of $\mathscr C$ modulo countable sets. The new parameter is obtained by "thickening up" single points in the definition of VC dimension to uncountable "clusters". Equivalently, $\VC(\mathscr C\modd\omega_1)\leq d$ if and only if every countable subclass of $\mathscr C$ has VC dimension $\leq d$ outside a countable subset of $\Omega$. The new parameter can be also expressed as the classical VC dimension of $\mathscr C$ calculated on a suitable subset of a compactification of $\Omega$. We do not make any measurability assumptions on $\mathscr C$, assuming instead the validity of Martin's Axiom (MA). Similar results are obtained for function learning in terms of fat-shattering dimension modulo countable sets, but, just like in the classical distribution-free case, the finiteness of this parameter is sufficient but not necessary for PAC learnability under non-atomic measures.

Foundations of Inference

We present a simple and clear foundation for finite inference that unites and significantly extends the approaches of Kolmogorov and Cox. Our approach is based on quantifying lattices of logical statements in a way that satisfies general lattice symmetries. With other applications such as measure theory in mind, our derivations assume minimal symmetries, relying on neither negation nor continuity nor differentiability. Each relevant symmetry corresponds to an axiom of quantification, and these axioms are used to derive a unique set of quantifying rules that form the familiar probability calculus. We also derive a unique quantification of divergence, entropy and information.

Updating Sets of Probabilities

There are several well-known justifications for conditioning as the appropriate method for updating a single probability measure, given an observation. However, there is a significant body of work arguing for sets of probability measures, rather than single measures, as a more realistic model of uncertainty. Conditioning still makes sense in this context--we can simply condition each measure in the set individually, then combine the results--and, indeed, it seems to be the preferred updating procedure in the literature. But how justified is conditioning in this richer setting? Here we show, by considering an axiomatic account of conditioning given by van Fraassen, that the single-measure and sets-of-measures cases are very different. We show that van Fraassen's axiomatization for the former case is nowhere near sufficient for updating sets of measures. We give a considerably longer (and not as compelling) list of axioms that together force conditioning in this setting, and describe other update methods that are allowed once any of these axioms is dropped.