Beyond Minimax Rates in Group Distributionally Robust Optimization via a Novel Notion of Sparsity
Nguyen, Quan, Mehta, Nishant A., Guzmán, Cristóbal
–arXiv.org Artificial Intelligence
The minimax sample complexity of group distributionally robust optimization (GDRO) has been determined up to a $\log(K)$ factor, for $K$ the number of groups. In this work, we venture beyond the minimax perspective via a novel notion of sparsity that we dub $(\lambda, \beta)$-sparsity. In short, this condition means that at any parameter $\theta$, there is a set of at most $\beta$ groups whose risks at $\theta$ all are at least $\lambda$ larger than the risks of the other groups. To find an $\epsilon$-optimal $\theta$, we show via a novel algorithm and analysis that the $\epsilon$-dependent term in the sample complexity can swap a linear dependence on $K$ for a linear dependence on the potentially much smaller $\beta$. This improvement leverages recent progress in sleeping bandits, showing a fundamental connection between the two-player zero-sum game optimization framework for GDRO and per-action regret bounds in sleeping bandits. The aforementioned result assumes having a particular $\lambda$ as input. Perhaps surprisingly, we next show an adaptive algorithm which, up to log factors, gets sample complexity that adapts to the best $(\lambda, \beta)$-sparsity condition that holds. Finally, for a particular input $\lambda$, we also show how to get a dimension-free sample complexity result.
arXiv.org Artificial Intelligence
Oct-1-2024
- Country:
- Europe > United Kingdom
- England > Cambridgeshire > Cambridge (0.04)
- North America
- Canada > Ontario
- Toronto (0.04)
- United States (0.04)
- Canada > Ontario
- South America > Chile (0.04)
- Europe > United Kingdom
- Genre:
- Research Report (0.50)
- Technology: