switching-constrained online convex optimization
- North America > United States > California (0.14)
- North America > United States > Massachusetts (0.04)
- North America > Canada (0.04)
- (2 more...)
Minimax Regret of Switching-Constrained Online Convex Optimization: No Phase Transition
We study the problem of switching-constrained online convex optimization (OCO), where the player has a limited number of opportunities to change her action. While the discrete analog of this online learning task has been studied extensively, previous work in the continuous setting has neither established the minimax rate nor algorithmically achieved it. In this paper, we show that $ T $-round switching-constrained OCO with fewer than $ K $ switches has a minimax regret of $ \Theta(\frac{T}{\sqrt{K}}) $. In particular, it is at least $ \frac{T}{\sqrt{2K}} $ for one dimension and at least $ \frac{T}{\sqrt{K}} $ for higher dimensions. The lower bound in higher dimensions is attained by an orthogonal subspace argument.
- North America > United States > California (0.14)
- North America > United States > Massachusetts (0.04)
- North America > Canada (0.04)
- (2 more...)
Review for NeurIPS paper: Minimax Regret of Switching-Constrained Online Convex Optimization: No Phase Transition
Additional Feedback: Can you explain more what is the reason for the disappearance of the phase transition compared to the discrete game? In the discrete game without a switching bound, you would typically get a total cost of (1 epsilon)*OPT (log n)/epsilon, and then set epsilon sqrt((log n)/T) to balance the losses. This means that even without a switching bound, against an oblivious adversary, you're switching between actions only about 1/sqrt(T) of the time anyway. On the other hand, you are constantly modifying your probability distribution every round (just that this change is only occasionally reflected in a switch of actions). In your game, there isn't this distinction between the hidden probability distribution and the action played. Is that partly the difference?
Review for NeurIPS paper: Minimax Regret of Switching-Constrained Online Convex Optimization: No Phase Transition
The initial four reviews recommended accepting, based on the strength of the results and the interesting ideas. However, some issues were pointed out mainly related to the novelty value of the work and relation to several earlier results in the field. The authors in their reply provided an informative discussion on these issues. This satisfied the reviewers and strengthened their confidence for accepting the paper.
Minimax Regret of Switching-Constrained Online Convex Optimization: No Phase Transition
We study the problem of switching-constrained online convex optimization (OCO), where the player has a limited number of opportunities to change her action. While the discrete analog of this online learning task has been studied extensively, previous work in the continuous setting has neither established the minimax rate nor algorithmically achieved it. In this paper, we show that T -round switching-constrained OCO with fewer than K switches has a minimax regret of \Theta(\frac{T}{\sqrt{K}}) . The lower bound in higher dimensions is attained by an orthogonal subspace argument. In one dimension, a novel adversarial strategy yields the lower bound of O(\frac{T}{\sqrt{K}}), but a precise minimax analysis including constants is more involved.