Multiple Learning Rates


MetaGrad: Multiple Learning Rates in Online Learning

Erven, Tim van, Koolen, Wouter M.

Neural Information Processing Systems

In online convex optimization it is well known that certain subclasses of objective functions are much easier than arbitrary convex functions. We are interested in designing adaptive methods that can automatically get fast rates in as many such subclasses as possible, without any manual tuning. Previous adaptive methods are able to interpolate between strongly convex and general convex functions. We present a new method, MetaGrad, that adapts to a much broader class of functions, including exp-concave and strongly convex functions, but also various types of stochastic and non-stochastic functions without any curvature. For instance, MetaGrad can achieve logarithmic regret on the unregularized hinge loss, even though it has no curvature, if the data come from a favourable probability distribution. MetaGrad's main feature is that it simultaneously considers multiple learning rates. Unlike all previous methods with provable regret guarantees, however, its learning rates are not monotonically decreasing over time and are not tuned based on a theoretically derived bound on the regret. Instead, they are weighted directly proportional to their empirical performance on the data using a tilted exponential weights master algorithm.
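
The master step described in the abstract can be sketched in a few lines. The following is a simplified illustration, not the paper's full algorithm: the real MetaGrad runs a per-learning-rate Online Newton Step on a quadratic surrogate loss, whereas the slaves below take plain projected gradient steps; the class name, the grid size n_etas, and the bounds D (domain diameter) and G (gradient norm) are illustrative assumptions.

```python
import numpy as np

class MetaGradSketch:
    """Hedged sketch of MetaGrad's tilted exponential weights master.

    Simplifications: slaves take plain projected gradient steps instead
    of the paper's per-eta Online Newton Step; D bounds the domain
    diameter and G the gradient norms (both assumed known here).
    """

    def __init__(self, dim, D=1.0, G=1.0, n_etas=10):
        # Exponentially spaced grid of learning rates, largest ~ 1/(5*D*G).
        self.etas = np.array([2.0 ** -i / (5.0 * D * G) for i in range(n_etas)])
        self.pi = np.full(n_etas, 1.0 / n_etas)  # master weights (prior)
        self.slaves = np.zeros((n_etas, dim))    # one iterate per eta
        self.D = D

    def predict(self):
        # Master prediction: eta-tilted weighted average of slave iterates.
        tilt = self.pi * self.etas
        return tilt @ self.slaves / tilt.sum()

    def update(self, w_t, grad):
        # Surrogate regret terms s_i = g_t . (w_t - w_i) for each slave.
        s = (w_t - self.slaves) @ grad
        # Quadratic surrogate loss: -eta * s + (eta * s)^2.
        surrogate = -self.etas * s + (self.etas * s) ** 2
        # Master weights track empirical performance directly:
        # multiply by exp(-surrogate loss), then renormalize.
        self.pi *= np.exp(-surrogate)
        self.pi /= self.pi.sum()
        # Simplified slave step: gradient descent with step eta_i,
        # projected back onto the ball of radius D.
        self.slaves -= self.etas[:, None] * grad
        norms = np.maximum(np.linalg.norm(self.slaves, axis=1, keepdims=True), 1e-12)
        self.slaves *= np.minimum(1.0, self.D / norms)
```

A round then consists of w = alg.predict(), observing a subgradient g of f_t at w, and calling alg.update(w, g). Note how the master update depends only on observed surrogate losses, not on a theoretically derived regret bound.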


Reviews: MetaGrad: Multiple Learning Rates in Online Learning

Neural Information Processing Systems

Online convex minimization is a fundamental problem in learning theory with various applications. Here, we are given a sequence of convex functions f1,...,fT in an online fashion, and we must choose a point at each round so as to minimize the regret against the minimizer of the sum. This is a very well studied question, with various guarantees known even when we are only given gradient information about the functions fi. For general convex functions this leads to a regret of O(sqrt(T)), and for some special cases better guarantees are known: O(log T) for strongly convex functions, O(d log T) for exp-concave functions, etc. However, the algorithms that achieve these better guarantees are quite different, and in some cases they need to know bounds on specific parameters of the functions (e.g., the strong-convexity constants of the fi's, although this can now be avoided).
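
For concreteness, the O(sqrt(T)) baseline the reviewer mentions is achieved by plain online (sub)gradient descent with a decaying step size. The sketch below assumes gradient-only access and a Euclidean ball of radius D as the decision set; the function name and the bounds D and G are illustrative, not taken from the review.

```python
import numpy as np

def online_gradient_descent(subgradient, T, dim, D=1.0, G=1.0):
    """Plain online (sub)gradient descent: the O(sqrt(T)) baseline.

    subgradient(t, w) should return a subgradient of f_t at w
    (gradient-only access). The step size D / (G * sqrt(t)) and the
    projection onto the Euclidean ball of radius D are the standard
    choices for the general convex case.
    """
    w = np.zeros(dim)
    iterates = []
    for t in range(1, T + 1):
        iterates.append(w.copy())            # play w_t, then observe f_t
        g = subgradient(t, w)
        w = w - (D / (G * np.sqrt(t))) * g   # gradient step
        norm = np.linalg.norm(w)
        if norm > D:                         # project back onto the ball
            w *= D / norm
    return iterates
```

Adaptive methods like MetaGrad aim to match this worst-case rate while automatically improving to O(log T) or O(d log T) when the function sequence allows it.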


CRPS Learning

Berrisch, Jonathan, Ziel, Florian

arXiv.org Machine Learning

Combination and aggregation techniques can improve forecast accuracy substantially. This also holds for probabilistic forecasting methods, where full predictive distributions are combined. There are several time-varying and adaptive weighting schemes, such as Bayesian model averaging (BMA). However, the performance of different forecasters may vary not only over time but also across parts of the distribution: one forecaster may be more accurate in the center of the distribution, while others perform better at predicting the tails. Consequently, we introduce a new weighting procedure that accounts for varying performance both across time and across the distribution. We discuss pointwise online aggregation algorithms that optimize with respect to the continuous ranked probability score (CRPS). After analyzing the theoretical properties of a fully adaptive Bernstein online aggregation (BOA) method, we introduce smoothing procedures for pointwise CRPS learning. The properties are confirmed and discussed using simulation studies. Additionally, we illustrate the performance in a forecasting study for carbon markets, where we predict the distribution of European emission allowance prices.
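
As a rough illustration of pointwise CRPS learning, the sketch below aggregates expert quantile functions level by level, using the pinball (quantile) loss whose integral over quantile levels recovers the CRPS. It substitutes plain exponential weights for the paper's BOA update and omits the smoothing across levels; all names and the quantile grid are illustrative assumptions.

```python
import numpy as np

def pointwise_crps_aggregation(expert_quantiles, observations, eta=1.0):
    """Pointwise aggregation of predictive distributions (sketch).

    expert_quantiles[t] has shape (n_experts, n_taus): each expert's
    predicted quantiles at levels taus at time t. A separate weight
    vector is kept per quantile level and updated with exponential
    weights on the pinball (quantile) loss, whose integral over levels
    gives the CRPS. Plain exponential weights stand in for the paper's
    BOA update, and no smoothing across levels is applied.
    """
    n_experts, n_taus = expert_quantiles[0].shape
    taus = (np.arange(n_taus) + 0.5) / n_taus       # quantile level grid
    W = np.full((n_taus, n_experts), 1.0 / n_experts)
    combined = []
    for Q, y in zip(expert_quantiles, observations):
        q_hat = np.einsum('ke,ek->k', W, Q)         # combine per level
        combined.append(q_hat)
        diff = y - Q                                # (n_experts, n_taus)
        pinball = np.where(diff >= 0, taus * diff, (taus - 1.0) * diff)
        W *= np.exp(-eta * pinball.T)               # reweight per level
        W /= W.sum(axis=1, keepdims=True)
    return np.array(combined)
```

Keeping an independent weight vector per quantile level is what lets one expert dominate in the center of the distribution while another dominates in the tails.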


A Second-order Bound with Excess Losses

Gaillard, Pierre, Stoltz, Gilles, Van Erven, Tim

arXiv.org Machine Learning

We study online aggregation of the predictions of experts, and first show new second-order regret bounds in the standard setting, which are obtained via a version of the Prod algorithm (and also a version of the polynomially weighted average algorithm) with multiple learning rates. These bounds are in terms of excess losses, the differences between the instantaneous losses suffered by the algorithm and the ones of a given expert. We then demonstrate the usefulness of these bounds in the context of experts that report their confidences as a number in the interval [0, 1], using a generic reduction to the standard setting. We conclude with two other applications in the standard setting, which improve the known bounds in case of small excess losses and show a bounded regret against i.i.d. sequences of losses.
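
A minimal sketch of a Prod-style update with one learning rate per expert, driven by excess losses, might look as follows. The fixed per-expert learning rates and the loss range [0, 1] are assumptions for illustration; the paper's variants also tune the learning rates over time.

```python
import numpy as np

def ml_prod(expert_losses, etas=None):
    """Prod-style aggregation with one learning rate per expert (sketch).

    expert_losses has shape (T, n_experts) with losses assumed in [0, 1].
    Each expert keeps a potential w_i updated multiplicatively with its
    excess loss (algorithm's loss minus the expert's); predictions use
    weights proportional to eta_i * w_i, as in the multiple-learning-rate
    Prod variants. Fixed etas are a simplification.
    """
    T, K = expert_losses.shape
    if etas is None:
        # eta <= 1/2 keeps the multiplicative factor 1 + eta * x positive
        # for excess losses x in [-1, 1].
        etas = np.full(K, 0.5 / np.sqrt(T))
    w = np.ones(K)
    algorithm_losses = []
    for t in range(T):
        p = etas * w
        p /= p.sum()                            # prediction weights
        loss_alg = p @ expert_losses[t]         # mixture loss this round
        algorithm_losses.append(loss_alg)
        excess = loss_alg - expert_losses[t]    # excess loss per expert
        w *= 1.0 + etas * excess                # Prod update
    return np.array(algorithm_losses)
```

Because each expert's potential grows with its own excess losses, the resulting bounds are second-order: they depend on the variance-like sums of excess losses rather than on the horizon T alone.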