Labeled sample compression schemes for complexes of oriented matroids

Chepoi, Victor, Knauer, Kolja, Philibert, Manon

arXiv.org Artificial Intelligence 

Littlestone and Warmuth [51] introduced sample compression schemes as an abstraction of the underlying structure of learning algorithms. Roughly, the aim of a sample compression scheme is to compress samples of a concept class (i.e., of a set system) C as much as possible, such that data coherent with the original samples can be reconstructed from the compressed data. There are two types of sample compression schemes: labeled (see [35, 51]) and unlabeled (see [7, 34, 49]). A labeled compression scheme of size k compresses every sample of C to a labeled subsample of size at most k, and an unlabeled compression scheme of size k compresses every sample of C to a subset of size at most k of the domain of the sample (see the end of the introduction for precise definitions). The Vapnik-Chervonenkis dimension (VC-dimension) of a set system was introduced in [69] as a complexity measure of set systems. VC-dimension is central in PAC-learning and plays an important role in combinatorics, algorithmics, discrete geometry, and combinatorial optimization. In particular, it coincides with the rank in the theory of (complexes of) oriented matroids. Furthermore, within machine learning and closely tied to the topic of this paper, the sample compression conjecture of [35] and [51] states that any set system of VC-dimension d has a labeled sample compression scheme of size O(d). This question remains one of the oldest open problems in computational learning theory.
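As a concrete illustration of the definition (a standard textbook example, not a construction from this paper), the following Python sketch implements a labeled compression scheme of size 1 for the class of threshold concepts h_t(x) = 1 iff x <= t on the real line, a class of VC-dimension 1. The function names `compress` and `reconstruct` are chosen here purely for illustration.

```python
# Illustrative labeled sample compression scheme of size 1 for threshold
# concepts h_t(x) = 1 iff x <= t on the real line (VC-dimension 1).
# Standard textbook example; not a construction from the paper.

def compress(sample):
    """Compress a nonempty labeled sample (list of (point, label) pairs,
    assumed consistent with some threshold concept) to one labeled point."""
    positives = [x for x, y in sample if y == 1]
    if positives:
        # Keep the rightmost positively labeled point.
        return (max(positives), 1)
    # No positive examples: keep the leftmost negatively labeled point.
    negatives = [x for x, y in sample if y == 0]
    return (min(negatives), 0)

def reconstruct(compressed):
    """Return a hypothesis consistent with every point of the original sample."""
    x, label = compressed
    if label == 1:
        # Threshold placed at the kept positive point.
        return lambda y: 1 if y <= x else 0
    # Kept point was negative, so everything at or beyond it is negative.
    return lambda y: 1 if y < x else 0

# Usage: the reconstructed hypothesis agrees with the original sample.
sample = [(0.5, 1), (1.2, 1), (3.0, 0), (4.7, 0)]
h = reconstruct(compress(sample))
assert all(h(x) == y for x, y in sample)
```

The point of the example is that the reconstruction function only sees the single kept labeled point, yet always recovers a hypothesis coherent with the full sample; the sample compression conjecture asks whether an analogous scheme of size O(d) exists for every set system of VC-dimension d.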
