Distributionally Robust Losses for Latent Covariate Mixtures
John Duchi, Tatsunori Hashimoto, Hongseok Namkoong
While modern large-scale datasets often consist of heterogeneous subpopulations (for example, multiple demographic groups or multiple text corpora), the standard practice of minimizing average loss fails to guarantee uniformly low losses across all subpopulations. We propose a convex procedure that controls the worst-case performance over all subpopulations of a given size. Our procedure comes with finite-sample (nonparametric) convergence guarantees on the worst-off subpopulation. Empirically, we observe on lexical similarity, wine quality, and recidivism prediction tasks that our worst-case procedure learns models that do well against unseen subpopulations.
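To make "worst-case performance over all subpopulations of a given size" concrete, the sketch below computes the worst average loss over any subset containing at least a fraction alpha of the samples, which reduces to averaging the largest losses. This is only an illustration of the quantity in its simplest form, not the authors' procedure: the abstract does not give the algorithm, and the title suggests that the paper defines subpopulations through covariates rather than arbitrary subsets of samples. The function name and the `alpha` parameter are assumptions introduced here.

```python
import math


def worst_case_subpopulation_loss(losses, alpha):
    """Worst average loss over any subset holding at least an alpha
    fraction of the samples (hypothetical helper, not from the paper).

    The maximizing subset is the k = ceil(alpha * n) samples with the
    largest losses, so the answer is the mean of the top-k losses.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    if not losses:
        raise ValueError("losses must be non-empty")

    n = len(losses)
    # Small tolerance guards against float round-off such as 0.3 * 10 > 3.
    k = max(1, math.ceil(alpha * n - 1e-9))
    worst = sorted(losses, reverse=True)[:k]
    return sum(worst) / k


# Per-example losses for a toy model: three easy examples and one hard one.
example_losses = [0.1, 0.2, 0.3, 2.0]
average_loss = worst_case_subpopulation_loss(example_losses, 1.0)
worst_half_loss = worst_case_subpopulation_loss(example_losses, 0.5)
worst_quarter_loss = worst_case_subpopulation_loss(example_losses, 0.25)
```

With alpha = 1 the quantity is the ordinary average loss. As alpha shrinks, it is dominated by the hardest examples, which is why a low average loss does not by itself imply a low loss on every subpopulation.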
Jul-28-2020
- Country:
  - South America > Paraguay
  - North America > United States
    - New York (0.04)
    - Massachusetts (0.04)
    - California > Santa Clara County
      - Palo Alto (0.04)
  - Europe
    - Western Europe (0.04)
    - United Kingdom > England
      - Cambridgeshire > Cambridge (0.04)
      - Oxfordshire > Oxford (0.04)
    - Italy > Calabria
      - Catanzaro Province > Catanzaro (0.04)
- Genre:
  - Research Report
    - New Finding (0.67)
    - Experimental Study (0.45)
- Technology: