Reasoning About Generalization via Conditional Mutual Information

Thomas Steinke, Lydia Zakynthinou

arXiv.org Machine Learning 

How can we ensure that a machine learning system produces an output that generalizes to the underlying distribution, rather than overfitting its training data? That is, how can we ensure that the hypotheses or models that are produced are reflective of the underlying population the training data was drawn from, rather than patterns that occur only by chance in the training data? This is perhaps the fundamental question for the science of statistical machine learning.

A vast array of methods have been proposed to answer this question. Most notably, the theory of uniform convergence shows that, if the output is sufficiently "simple," then it cannot overfit too much. A more recent line of work has used distributional stability (in the form of differential privacy) to provide generalization guarantees that compose adaptively - that is, statistical validity is preserved even when a dataset is reused multiple times, with each successive analysis being influenced by the outcomes of prior analyses. Other methods for proving generalization include compression schemes and uniform stability.

Unfortunately, these different methods for providing generalization guarantees are largely disconnected from one another; it is, in general, not possible to compare or combine techniques. In this paper, we provide a framework to reason about many of these differing approaches using the unifying language of information theory.
