to

PAC learnability under non-atomic measures: a problem by Vidyasagar

In response to a 1997 problem of M. Vidyasagar, we state a criterion for PAC learnability of a concept class $\mathscr C$ under the family of all non-atomic (diffuse) measures on the domain $\Omega$. The uniform Glivenko--Cantelli property with respect to non-atomic measures is no longer a necessary condition, and consistent learnability cannot in general be expected. Our criterion is stated in terms of a combinatorial parameter $\VC({\mathscr C}\,{\mathrm{mod}}\,\omega_1)$ which we call the VC dimension of $\mathscr C$ modulo countable sets. The new parameter is obtained by "thickening up" single points in the definition of VC dimension to uncountable "clusters". Equivalently, $\VC(\mathscr C\modd\omega_1)\leq d$ if and only if every countable subclass of $\mathscr C$ has VC dimension $\leq d$ outside a countable subset of $\Omega$. The new parameter can be also expressed as the classical VC dimension of $\mathscr C$ calculated on a suitable subset of a compactification of $\Omega$. We do not make any measurability assumptions on $\mathscr C$, assuming instead the validity of Martin's Axiom (MA). Similar results are obtained for function learning in terms of fat-shattering dimension modulo countable sets, but, just like in the classical distribution-free case, the finiteness of this parameter is sufficient but not necessary for PAC learnability under non-atomic measures.

Updating Sets of Probabilities

There are several well-known justifications for conditioning as the appropriate method for updating a single probability measure, given an observation. However, there is a significant body of work arguing for sets of probability measures, rather than single measures, as a more realistic model of uncertainty. Conditioning still makes sense in this context--we can simply condition each measure in the set individually, then combine the results--and, indeed, it seems to be the preferred updating procedure in the literature. But how justified is conditioning in this richer setting? Here we show, by considering an axiomatic account of conditioning given by van Fraassen, that the single-measure and sets-of-measures cases are very different. We show that van Fraassen's axiomatization for the former case is nowhere near sufficient for updating sets of measures. We give a considerably longer (and not as compelling) list of axioms that together force conditioning in this setting, and describe other update methods that are allowed once any of these axioms is dropped.

Foundations of Inference

We present a simple and clear foundation for finite inference that unites and significantly extends the approaches of Kolmogorov and Cox. Our approach is based on quantifying lattices of logical statements in a way that satisfies general lattice symmetries. With other applications such as measure theory in mind, our derivations assume minimal symmetries, relying on neither negation nor continuity nor differentiability. Each relevant symmetry corresponds to an axiom of quantification, and these axioms are used to derive a unique set of quantifying rules that form the familiar probability calculus. We also derive a unique quantification of divergence, entropy and information.

Qualitative Decision Theory with Sugeno Integrals

This paper presents an axiomatic framework for qualitative decision under uncertainty in a finite setting. The corresponding utility is expressed by a sup-min expression, called Sugeno (or fuzzy) integral. Technically speaking, Sugeno integral is a median, which is indeed a qualitative counterpart to the averaging operation underlying expected utility. The axiomatic justification of Sugeno integral-based utility is expressed in terms of preference between acts as in Savage decision theory. Pessimistic and optimistic qualitative utilities, based on necessity and possibility measures, previously introduced by two of the authors, can be retrieved in this setting by adding appropriate axioms.

Representation Dependence in Probabilistic Inference

Non-deductive reasoning systems are often representation dependent: representing the same situation in two different ways may cause such a system to return two different answers. Some have viewed this as a significant problem. For example, the principle of maximum entropyhas been subjected to much criticism due to its representation dependence. There has, however, been almost no work investigating representation dependence. In this paper, we formalize this notion and show that it is not a problem specific to maximum entropy. In fact, we show that any representation-independent probabilistic inference procedure that ignores irrelevant information is essentially entailment, in a precise sense. Moreover, we show that representation independence is incompatible with even a weak default assumption of independence. We then show that invariance under a restricted class of representation changes can form a reasonable compromise between representation independence and other desiderata, and provide a construction of a family of inference procedures that provides such restricted representation independence, using relative entropy.