Upper Bounds on the Generalization Error of Private Algorithms

Rodríguez-Gálvez, Borja, Bassi, Germán, Skoglund, Mikael

arXiv.org Machine Learning 

In this work, we study the generalization capability of algorithms from an information-theoretic perspective. It has been shown that the generalization error of an algorithm is bounded from above in terms of the mutual information between the algorithm's output hypothesis and the dataset with which it was trained. We build upon this fact and introduce a mathematical formulation to obtain upper bounds on this mutual information. We then develop a strategy using this formulation, based on the method of types and typicality, to find explicit upper bounds on the generalization error of smooth algorithms, i.e., algorithms that produce similar output hypotheses given similar input datasets. In particular, we show the bounds obtained with this strategy for the case of ɛ-DP and µ-GDP algorithms. A learning algorithm is a mechanism that takes a collection of data samples as an input and outputs a hypothesis. The usage of this type of algorithm spans from estimating the sinusoidal parameters of a received, noisy signal [1] to detecting and localizing a tumor from an MRI scan [2]. The generalization capability of a learning algorithm indicates its ability to perform similarly in new, unseen data, as it performed in the finite amount of data with which it was trained. Therefore, characterizing this capability allows us to evaluate the worth of an algorithm outside of the training data and, with a proper characterization framework, design robust algorithms.

Duplicate Docs Excel Report

Title
None found

Similar Docs  Excel Report  more

TitleSimilaritySource
None found