Reasoning About Generalization via Conditional Mutual Information

Jan-24-2020–arXiv.org Machine Learning

How can we ensure that a machine learning system produces an output that generalizes to the underlying distribution, rather than overfitting its training data? That is, how can we ensure that the hypotheses or models that are produced reflect the underlying population the training data was drawn from, rather than patterns that occur only by chance in the training data? This is perhaps the fundamental question for the science of statistical machine learning. A vast array of methods have been proposed to answer this question. Most notably, the theory of uniform convergence shows that, if the output is sufficiently "simple," then it cannot overfit too much. A more recent line of work has used distributional stability (in the form of differential privacy) to provide generalization guarantees that compose adaptively: that is, statistical validity is preserved even when a dataset is reused multiple times, with each successive analysis being influenced by the outcomes of prior analyses. Other methods for proving generalization include compression schemes and uniform stability. Unfortunately, these different methods for providing generalization guarantees are largely disconnected from one another; it is, in general, not possible to compare or combine techniques. In this paper, we provide a framework to reason about many of these differing approaches using the unifying language of information theory.

Country:
- North America > United States
  - New York > New York County
    - New York City (0.04)
  - Massachusetts > Middlesex County
    - Cambridge (0.04)
- Europe > Spain
  - Andalusia > Cádiz Province > Cadiz (0.04)

Genre:
- Research Report (0.50)

Industry:
- Information Technology > Security & Privacy (0.67)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Computational Learning Theory (0.93)
  - Performance Analysis > Accuracy (0.93)
  - Statistical Learning (0.67)
