A Simple and Adaptive Learning Rate for FTRL in Online Learning with Minimax Regret of \Theta(T {2/3}) and its Application to Best-of-Both-Worlds

Neural Information Processing Systems 

Follow-the-Regularized-Leader (FTRL) is a powerful framework for various online learning problems. By designing its regularizer and learning rate to be adaptive to past observations, FTRL is known to work adaptively to various properties of an underlying environment. However, most existing adaptive learning rates are for online learning problems with a minimax regret of \Theta(\sqrt{T}) for the number of rounds T, and there are only a few studies on adaptive learning rates for problems with a minimax regret of \Theta(T {2/3}), which include several important problems dealing with indirect feedback. To address this limitation, we establish a new adaptive learning rate framework for problems with a minimax regret of \Theta(T {2/3}) . Our learning rate is designed by matching the stability, penalty, and bias terms that naturally appear in regret upper bounds for problems with a minimax regret of \Theta(T {2/3}) .