On Misspecification in Prediction Problems and Robustness via Improper Learning

Duchi, John, Marsden, Annie, Valiant, Gregory

arXiv.org Machine Learning 

Suppose we seek a probability distribution p(y | x) modeling outcomes y given data x. The typical approach is to choose a parametric family of probability distributions, then find the "best" member of this family according to a given loss. It is rarely realistic to assume that the parametric family is well-specified, and thus it is important to understand the consequences of misspecification and how to circumvent these downsides. To address these challenges, in this paper we derive a new measure of a problem's robustness to misspecification that relies on the curvature of the loss at hand and the putative parametric family, proving that this measure lower bounds convergence rates for prediction error and certifies the failure of a parametric family and loss to be robust (or to achieve optimal convergence rates for prediction). To complement this new family of lower bounds for probabilistic prediction problems, we build on earlier work on improper learning [40, 14] (where we may choose predictions p(y | x) outside the given model family) to show how it is possible to be robust to such misspecification, and moreover, we give new optimality guarantees for such improper procedures. Formalizing our setting, we consider the following probabilistic game: a player receives a covariate vector x ∈ X, plays a distribution p(· | x) on a target set Y, then receives y ∈ Y and suffers loss L(p(· | x), y). We study both a sequential and a stochastic variant of this problem.
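The sequential variant of the game described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the paper's procedure: the log loss is one assumed instance of L, and `play_game`, `log_loss`, and `LaplacePredictor` are hypothetical names introduced here. The example player ignores the covariate and predicts add-one smoothed label frequencies.

```python
import math
from typing import Callable, Dict, Hashable, Iterable, Sequence, Tuple

Distribution = Dict[Hashable, float]  # maps each outcome y in Y to a probability


def log_loss(p: Distribution, y: Hashable) -> float:
    """Assumed instance of the loss L(p(. | x), y): negative log-likelihood."""
    return -math.log(p[y])


def play_game(
    rounds: Iterable[Tuple[Sequence[float], Hashable]],
    predict: Callable[[Sequence[float]], Distribution],
    update: Callable[[Sequence[float], Hashable], None],
    loss: Callable[[Distribution, Hashable], float] = log_loss,
) -> float:
    """Run the sequential game and return the cumulative loss."""
    total = 0.0
    for x, y in rounds:
        p = predict(x)       # player sees x and commits to a distribution on Y
        total += loss(p, y)  # outcome y is revealed and the loss is suffered
        update(x, y)         # player may adapt before the next round
    return total


class LaplacePredictor:
    """Hypothetical player: ignores x, predicts add-one smoothed frequencies."""

    def __init__(self, labels: Sequence[Hashable]) -> None:
        self.counts = {label: 1 for label in labels}

    def predict(self, x: Sequence[float]) -> Distribution:
        total = sum(self.counts.values())
        return {label: c / total for label, c in self.counts.items()}

    def update(self, x: Sequence[float], y: Hashable) -> None:
        self.counts[y] += 1


# Toy data (made up for illustration): three rounds with binary outcomes.
rounds = [((0.0,), 1), ((1.0,), 1), ((2.0,), 0)]
player = LaplacePredictor(labels=[0, 1])
cumulative = play_game(rounds, player.predict, player.update)
```

Any predictor exposing `predict` and `update` can be plugged in, including one whose predictions fall outside a fixed parametric family, which is the sense in which a procedure is improper.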
