Reviews: Adaptive Online Learning in Dynamic Environments
Neural Information Processing Systems
This paper studies online convex optimization in dynamic environments, where one wishes to control the regret with respect to sequences of comparators rather than a single static comparator. The complexity of a comparator sequence is measured by its path length P_T, and the goal is to obtain an optimal regret bound simultaneously for all path lengths. A first bound in this setting was established in the pioneering paper [1], which showed that online gradient descent (OGD, projected onto the convex compact domain) with step size 1/sqrt(T) achieves regret at most O(sqrt(T) (1 + P_T)) for all comparator sequences. However, there is a gap between this upper bound and the Omega(sqrt(T (1 + P_T))) lower bound on worst-case regret established in Theorem 2 of the present paper, which is the first lower bound for this problem. On the other hand, if the path length P_T one wishes to compare against is known in advance, optimally tuning the step size of OGD with respect to P_T in the OGD regret bound yields the optimal O(sqrt(T (1 + P_T))) regret for this path length.
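To make the algorithm under discussion concrete, here is a minimal sketch of projected OGD with the fixed step size 1/sqrt(T) mentioned above. The Euclidean-ball domain, the function names, and the squared-loss usage example are my own illustrative choices, not part of the paper:

```python
import numpy as np

def project_ball(x, radius=1.0):
    """Euclidean projection onto the ball of the given radius."""
    norm = np.linalg.norm(x)
    return x if norm <= radius else x * (radius / norm)

def ogd(grad_fn, T, dim, radius=1.0, step=None):
    """Projected online gradient descent with a fixed step size.

    grad_fn(t, x) returns the gradient of the round-t loss at x.
    With step ~ 1/sqrt(T) (the untuned choice from [1]), the regret
    against any comparator sequence u_1, ..., u_T of path length P_T
    is at most O(sqrt(T) * (1 + P_T)).
    """
    if step is None:
        step = 1.0 / np.sqrt(T)  # horizon-tuned but path-length-oblivious
    x = np.zeros(dim)
    iterates = []
    for t in range(T):
        iterates.append(x.copy())  # play x_t, then observe the gradient
        x = project_ball(x - step * grad_fn(t, x), radius)
    return iterates

# Usage: track a slowly moving target u_t under the squared loss
# 0.5 * ||x - u_t||^2, whose gradient at x is simply x - u_t.
targets = [np.array([np.cos(t / 10.0), np.sin(t / 10.0)]) for t in range(100)]
iterates = ogd(lambda t, x: x - targets[t], T=100, dim=2)
```

Knowing P_T in advance would instead suggest step ~ sqrt((1 + P_T) / T), which recovers the optimal O(sqrt(T (1 + P_T))) bound for that particular path length.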
Oct-7-2024, 05:10:39 GMT