
per-action regret


Hardness of Online Sleeping Combinatorial Optimization Problems

Satyen Kale, Chansoo Lee, David Pal

Neural Information Processing Systems

We show that several online combinatorial optimization problems that admit efficient no-regret algorithms become computationally hard in the sleeping setting where a subset of actions becomes unavailable in each round. Specifically, we show that the sleeping versions of these problems are at least as hard as PAC learning DNF expressions, a long-standing open problem.


Reviewer 1 Comment: With more space the authors might present more discussion of past/related work

Neural Information Processing Systems

We would like to thank the reviewers for their positive and constructive comments. Below we respond to each of your comments.

Response: Thanks, we will expand our discussion of related work, in particular including references [2]-[4] below.

Comment: It would be interesting to know if the approach of [1] works here and gives similar results. (The notion of regret in the "Prediction with specialist experts' advice" section of [1] is the relevant one.)

Comment: Why do we need to specify the "first" alive expert, rather than the alive expert with the optimal performance?


Reviews: Hardness of Online Sleeping Combinatorial Optimization Problems

Neural Information Processing Systems

The paper solves an open problem presented at COLT 2015 [Koolen et al., 2015]. It defines OSCO problems as online combinatorial optimization problems in the sleeping setting, in which in each trial a subset of actions becomes unavailable due to the unavailability of a subset of elements. Basically, the paper shows that OSCO problems are as hard as PAC learning of DNF expressions -- a long-standing open problem. In a nutshell, the hardness result is shown as follows: the paper reduces the problem of "Online Agnostic Learning of Disjunctions" to "Hard Instances" of OSCO problems with per-action regret, via a proof that has similarities to a proof in [Kanade and Steinke, 2014]. Interestingly enough, because of the online-to-batch conversion argument in [Kanade and Steinke, 2014], the hardness results also seem to hold for a benign form of adversary, namely stochastic availabilities and losses (i.e., they are drawn from an unknown but fixed joint distribution). After proving the general hardness results, the paper provides "Hard Instances" of various well-known OSCO problems to establish their hardness in particular -- including Online Sabotaged Shortest Path.
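For readers skimming this listing, the per-action notion of regret that recurs across these entries can be stated as follows (our paraphrase in our own notation, not any single paper's exact definition): the comparison against an arm is restricted to the rounds in which that arm is available.

```latex
% Per-action regret of an algorithm against arm a over horizon T,
% where S_t is the set of awake arms in round t, a_t the algorithm's
% play, and \ell_t the loss vector chosen by the adversary:
\[
\mathrm{Reg}_T(a) \;=\; \sum_{t=1}^{T} \mathbf{1}\{a \in S_t\}
\,\bigl( \ell_t(a_t) - \ell_t(a) \bigr).
\]
```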


Beyond Minimax Rates in Group Distributionally Robust Optimization via a Novel Notion of Sparsity

Nguyen, Quan, Mehta, Nishant A., Guzmán, Cristóbal

arXiv.org Artificial Intelligence

The minimax sample complexity of group distributionally robust optimization (GDRO) has been determined up to a $\log(K)$ factor, for $K$ the number of groups. In this work, we venture beyond the minimax perspective via a novel notion of sparsity that we dub $(\lambda, \beta)$-sparsity. In short, this condition means that at any parameter $\theta$, there is a set of at most $\beta$ groups whose risks at $\theta$ all are at least $\lambda$ larger than the risks of the other groups. To find an $\epsilon$-optimal $\theta$, we show via a novel algorithm and analysis that the $\epsilon$-dependent term in the sample complexity can swap a linear dependence on $K$ for a linear dependence on the potentially much smaller $\beta$. This improvement leverages recent progress in sleeping bandits, showing a fundamental connection between the two-player zero-sum game optimization framework for GDRO and per-action regret bounds in sleeping bandits. The aforementioned result assumes having a particular $\lambda$ as input. Perhaps surprisingly, we next show an adaptive algorithm which, up to log factors, gets sample complexity that adapts to the best $(\lambda, \beta)$-sparsity condition that holds. Finally, for a particular input $\lambda$, we also show how to get a dimension-free sample complexity result.
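As a reading aid, the sparsity condition described in the abstract can be written out as follows (a paraphrase of the abstract's wording with our own notation, not the paper's formal definition), where $R_k(\theta)$ denotes the risk of group $k$ at parameter $\theta$:

```latex
% (\lambda, \beta)-sparsity, paraphrased: at every \theta there is a set
% S(\theta) of at most \beta groups whose risks exceed all remaining
% groups' risks by a margin of at least \lambda.
\[
\forall \theta \;\; \exists\, S(\theta) \subseteq [K],\; |S(\theta)| \le \beta
\;:\quad
\min_{k \in S(\theta)} R_k(\theta) \;\ge\; \max_{k \notin S(\theta)} R_k(\theta) + \lambda .
\]
```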


Near-optimal Per-Action Regret Bounds for Sleeping Bandits

Nguyen, Quan, Mehta, Nishant A.

arXiv.org Machine Learning

We derive near-optimal per-action regret bounds for sleeping bandits, in which both the sets of available arms and their losses in every round are chosen by an adversary. In a setting with $K$ total arms and at most $A$ available arms in each round over $T$ rounds, the best known upper bound is $O(K\sqrt{TA\ln{K}})$, obtained indirectly via minimizing internal sleeping regrets. Compared to the minimax $\Omega(\sqrt{TA})$ lower bound, this upper bound contains an extra multiplicative factor of $K\ln{K}$. We address this gap by directly minimizing the per-action regret using generalized versions of EXP3, EXP3-IX and FTRL with Tsallis entropy, thereby obtaining near-optimal bounds of order $O(\sqrt{TA\ln{K}})$ and $O(\sqrt{T\sqrt{AK}})$. We extend our results to the setting of bandits with advice from sleeping experts, generalizing EXP4 along the way. This leads to new proofs for a number of existing adaptive and tracking regret bounds for standard non-sleeping bandits. Extending our results to the bandit version of experts that report their confidences leads to new bounds for the confidence regret that depend primarily on the sum of the experts' confidences. We prove a lower bound, showing that for any minimax optimal algorithm, there exists an action whose regret is sublinear in $T$ but linear in the number of its active rounds.
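To make the "generalized EXP3" idea in this abstract concrete, here is a minimal sketch, in our own code and names (this is not the paper's algorithm, and the function `sleeping_exp3` and its interface are our assumptions): an EXP3-style learner that each round normalizes its weights over the awake arms only, and whose importance-weighted update touches only the pulled arm, so a sleeping arm's weight stays frozen while it is unavailable.

```python
import math
import random


def sleeping_exp3(K, T, eta, available, loss, rng=None):
    """Toy EXP3-style sketch restricted to the awake arms each round.

    available(t) -> set of arm indices awake in round t (assumed nonempty)
    loss(t, arm) -> loss in [0, 1] of the pulled arm in round t
    Returns the list of pulled arms, one per round.
    """
    rng = rng or random.Random(0)
    log_w = [0.0] * K  # log-weights, for numerical stability
    pulls = []
    for t in range(T):
        awake = sorted(available(t))
        # Normalize the exponential weights over the awake arms only.
        m = max(log_w[i] for i in awake)
        w = [math.exp(log_w[i] - m) for i in awake]
        z = sum(w)
        p = [wi / z for wi in w]
        # Sample an awake arm from the restricted distribution.
        i = rng.choices(range(len(awake)), weights=p)[0]
        arm = awake[i]
        pulls.append(arm)
        # Importance-weighted loss estimate; all other arms (awake or
        # asleep) implicitly receive a zero loss estimate this round.
        ell_hat = loss(t, arm) / p[i]
        log_w[arm] -= eta * ell_hat
    return pulls
```

The design choice worth noticing is that an arm accumulates loss estimates only in rounds where it is available, which is what a per-action regret guarantee (regret measured over an arm's active rounds) would require of the update.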


Hardness of Online Sleeping Combinatorial Optimization Problems

Kale, Satyen, Lee, Chansoo, Pal, David

Neural Information Processing Systems

We show that several online combinatorial optimization problems that admit efficient no-regret algorithms become computationally hard in the sleeping setting where a subset of actions becomes unavailable in each round. Specifically, we show that the sleeping versions of these problems are at least as hard as PAC learning DNF expressions, a long-standing open problem. We show hardness for the sleeping versions of Online Shortest Paths, Online Minimum Spanning Tree, Online k-Subsets, Online k-Truncated Permutations, Online Minimum Cut, and Online Bipartite Matching. The hardness result for the sleeping version of the Online Shortest Paths problem resolves an open problem presented at COLT 2015 [Koolen et al., 2015].