metagrad
Online Inverse Linear Optimization: Efficient Logarithmic-Regret Algorithm, Robustness to Suboptimality, and Lower Bound
Shinsaku Sakaue[1], CyberAgent, Tokyo, Japan, shinsaku.sakaue@gmail.com, "3026 Taira Tsuchiya, The University of Tokyo and RIKEN AIP, Tokyo, Japan, tsuchiya@mist.i.u-tokyo.ac.jp, "3026 Han Bao[1], The Institute of Statistical Mathematics, Tokyo, Japan, bao.han@ism.ac.jp, "3026 Taihei Oki, Hokkaido University, Hokkaido, Japan, oki@icredd.hokudai.ac.jp
In online inverse linear optimization, a learner observes time-varying sets of feasible actions and an agent's optimal actions, selected by solving linear optimization over the feasible actions. The learner sequentially makes predictions of the agent's true linear objective function, and their quality is measured by the regret, the cumulative gap between optimal objective values and those achieved by following the learner's predictions. A seminal work by Bรคrmann et al. (2017) obtained a regret bound of O( T), where T is the time horizon. Subsequently, the regret bound has been improved to O(n4 lnT) by Besbes et al. (2021, 2025) and to O(nlnT) by Gollapudi et al. (2021), where nis the dimension of the ambient space of objective vectors. However, these logarithmic-regret methods are highly inefficient when T is large, as they need to maintain regions specified by O(T) constraints, which represent possible locations of the true objective vector.
MetaGrad: Multiple Learning Rates in Online Learning
Tim van Erven, Wouter M. Koolen
In online convex optimization it is well known that certain subclasses of objective functions are much easier than arbitrary convex functions. We are interested in designing adaptive methods that can automatically get fast rates in as many such subclasses as possible, without any manual tuning. Previous adaptive methods are able to interpolate between strongly convex and general convex functions. We present a new method, MetaGrad, that adapts to a much broader class of functions, including exp-concave and strongly convex functions, but also various types of stochastic and non-stochastic functions without any curvature. For instance, MetaGrad can achieve logarithmic regret on the unregularized hinge loss, even though it has no curvature, if the data come from a favourable probability distribution. MetaGrad's main feature is that it simultaneously considers multiple learning rates. Unlike previous methods with provable regret guarantees, however, its learning rates are not monotonically decreasing over time and are not tuned based on a theoretically derived bound on the regret. Instead, they are weighted directly proportional to their empirical performance on the data using a tilted exponential weights master algorithm.
Dual Adaptivity: A Universal Algorithm for Minimizing the Adaptive Regret of Convex Functions
To deal with changing environments, a new performance measure--adaptive regret, defined as the maximum static regret over any interval, was proposed in online learning. Under the setting of online convex optimization, several algorithms have been successfully developed to minimize the adaptive regret. However, existing algorithms lack universality in the sense that they can only handle one type of convex functions and need apriori knowledge of parameters. By contrast, there exist universal algorithms, such as MetaGrad, that attain optimal static regret for multiple types of convex functions simultaneously. Along this line of research, this paper presents the first universal algorithm for minimizing the adaptive regret of convex functions. Specifically, we borrow the idea of maintaining multiple learning rates in MetaGrad to handle the uncertainty of functions, and utilize the technique of sleeping experts to capture changing environments.
MetaGrad: Multiple Learning Rates in Online Learning
In online convex optimization it is well known that certain subclasses of objective functions are much easier than arbitrary convex functions. We are interested in designing adaptive methods that can automatically get fast rates in as many such subclasses as possible, without any manual tuning. Previous adaptive methods are able to interpolate between strongly convex and general convex functions. We present a new method, MetaGrad, that adapts to a much broader class of functions, including exp-concave and strongly convex functions, but also various types of stochastic and non-stochastic functions without any curvature. For instance, MetaGrad can achieve logarithmic regret on the unregularized hinge loss, even though it has no curvature, if the data come from a favourable probability distribution. MetaGrad's main feature is that it simultaneously considers multiple learning rates. Unlike all previous methods with provable regret guarantees, however, its learning rates are not monotonically decreasing over time and are not tuned based on a theoretically derived bound on the regret. Instead, they are weighted directly proportional to their empirical performance on the data using a tilted exponential weights master algorithm.