Minimax Statistical Learning with Wasserstein distances
As opposed to standard empirical risk minimization (ERM), distributionally robust optimization aims to minimize the worst-case risk over a larger ambiguity set containing the original empirical distribution of the training data. In this work, we describe a minimax framework for statistical learning with ambiguity sets given by balls in Wasserstein space. In particular, we prove generalization bounds that involve the covering number properties of the original ERM problem. As an illustrative example, we provide generalization guarantees for transport-based domain adaptation problems where the Wasserstein distance between the source and target domain distributions can be reliably estimated from unlabeled samples.
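In this line of work, the worst-case empirical risk over a Wasserstein ball is commonly handled through a Lagrangian relaxation: for a multiplier $\lambda \ge 0$, $\sup_{Q:\,W(Q,\hat P_n)\le\rho} \mathbb{E}_Q[\ell_\theta] \le \lambda\rho + \frac{1}{n}\sum_i \sup_z \big[\ell(\theta,z) - \lambda\, c(z, z_i)\big]$, so the inner supremum perturbs each training point against a transport penalty. Below is a minimal sketch of that idea for logistic regression with squared-Euclidean cost; the fixed multiplier, step sizes, and function names are illustrative choices for this sketch, not the paper's algorithm.

```python
import numpy as np

def logistic_loss_grad_theta(theta, X, y):
    # Gradient of the mean logistic loss w.r.t. the parameters.
    s = -y / (1.0 + np.exp(y * (X @ theta)))
    return (X * s[:, None]).mean(axis=0)

def logistic_loss_grad_x(theta, X, y):
    # Per-sample gradient of the logistic loss w.r.t. each feature vector.
    s = -y / (1.0 + np.exp(y * (X @ theta)))
    return s[:, None] * theta[None, :]

def robust_fit(X, y, lam=10.0, lr=0.1, ascent_lr=0.05,
               steps=200, ascent_steps=5):
    """Approximate worst-case ERM for a fixed multiplier `lam`:
    the inner supremum is approximated by a few gradient-ascent steps
    that perturb each training point, and the outer minimization is
    plain gradient descent on the parameters."""
    theta = np.zeros(X.shape[1])
    for _ in range(steps):
        Z = X.copy()
        for _ in range(ascent_steps):
            # Ascend on  loss(theta, z) - lam * ||z - x||^2  per sample.
            Z = Z + ascent_lr * (logistic_loss_grad_x(theta, Z, y)
                                 - 2.0 * lam * (Z - X))
        theta = theta - lr * logistic_loss_grad_theta(theta, Z, y)
    return theta
```

A larger `lam` shrinks the allowed perturbations (a smaller effective Wasserstein radius), recovering ordinary ERM in the limit.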
Reviews: Minimax Statistical Learning with Wasserstein distances
The paper investigates a minimax framework for statistical learning where the goal is to minimize the worst-case population risk over a family of distributions that are within a prescribed Wasserstein distance from the unknown data-generating distribution. The authors develop data-dependent generalization bounds and data-independent excess risk bounds (using smoothness assumptions) in the setting where the classical empirical risk minimization (ERM) algorithm is replaced by a robust procedure that minimizes the worst-case empirical risk with respect to distributions contained in a Wasserstein ball centered at the empirical distribution of the training data. The statistical minimax framework investigated by the authors resembles in spirit the one introduced in [9], where the ambiguity set is defined via moment constraints instead of the Wasserstein distance. The paper is well-written, with accurate references to previous literature and an extensive use of remarks to guide the development of the theory. The contributions are clearly emphasized, and the math is solid.
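The Wasserstein distance defining the ambiguity set can itself be estimated from samples; in one dimension, the empirical 1-Wasserstein distance between two equal-size samples has a closed form via order statistics, since the optimal coupling matches sorted points. A small illustration (numpy only; the sample sizes and distributions are illustrative):

```python
import numpy as np

def empirical_w1(x, y):
    """Empirical 1-Wasserstein distance between two 1-D samples of equal
    size: the optimal coupling pairs the sorted samples."""
    x, y = np.sort(x), np.sort(y)
    return np.abs(x - y).mean()

rng = np.random.default_rng(1)
source = rng.normal(0.0, 1.0, 2000)
target = rng.normal(0.5, 1.0, 2000)   # target domain shifted by 0.5
d = empirical_w1(source, target)       # close to the true distance 0.5
```

For shifted Gaussians with equal variance, the true W1 distance equals the mean shift, so the estimate concentrates near 0.5 as the sample size grows.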
Papers published at the Neural Information Processing Systems Conference.