AITopics | hedge algorithm

Collaborating Authors

hedge algorithm

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

On Optimal Robustness to Adversarial Corruption in Online Decision Problems

Neural Information Processing SystemsApr-25-2026, 13:15:33 GMT

This paper considers two fundamental sequential decision-making problems: the problem of prediction with expert advice and the multi-armed bandit problem. We focus on stochastic regimes in which an adversary may corrupt losses, and we investigate what level of robustness can be achieved against adversarial corruption. The main contribution of this paper is to show that optimal robustness can be expressed by a square-root dependency on the amount of corruption.

artificial intelligence, data mining, machine learning, (15 more...)

Neural Information Processing Systems

Technology:

Information Technology > Data Science > Data Mining > Big Data (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

e9bf14a419d77534105016f5ec122d62-Supplemental.pdf

Neural Information Processing SystemsFeb-10-2026, 22:55:57 GMT

Therefore, if ν() < +, then we can bound(10) with eαν(). To avoid crowded notations, we drop the conditioning onz from Pr[ |ρ = z]. The issue is how to proceed. Let φ be the standard normal density function andΦ be the CDF. The algorithm using SVT suchthat itonly releases the private answerstothe queries if the answer is sufficiently different from the "guess".

algorithm, artificial intelligence, machine learning, (19 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.68)

Add feedback

db346ccb62d491029b590bbbf0f5c412-Paper.pdf

Neural Information Processing SystemsFeb-10-2026, 17:31:44 GMT

algorithm, optimistic hedge, swap regret, (15 more...)

Neural Information Processing Systems

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Industry: Leisure & Entertainment > Games (0.47)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Tight Regret Upper and Lower Bounds for Optimistic Hedge in Two-Player Zero-Sum Games

Tsuchiya, Taira

arXiv.org Machine LearningOct-14-2025

In two-player zero-sum games, the learning dynamic based on optimistic Hedge achieves one of the best-known regret upper bounds among strongly-uncoupled learning dynamics. With an appropriately chosen learning rate, the social and individual regrets can be bounded by $O(\log(mn))$ in terms of the numbers of actions $m$ and $n$ of the two players. This study investigates the optimality of the dependence on $m$ and $n$ in the regret of optimistic Hedge. To this end, we begin by refining existing regret analysis and show that, in the strongly-uncoupled setting where the opponent's number of actions is known, both the social and individual regret bounds can be improved to $O(\sqrt{\log m \log n})$. In this analysis, we express the regret upper bound as an optimization problem with respect to the learning rates and the coefficients of certain negative terms, enabling refined analysis of the leading constants. We then show that the existing social regret bound as well as these new social and individual regret upper bounds cannot be further improved for optimistic Hedge by providing algorithm-dependent individual regret lower bounds. Importantly, these social regret upper and lower bounds match exactly including the constant factor in the leading term. Finally, building on these results, we improve the last-iterate convergence rate and the dynamic regret of a learning dynamic based on optimistic Hedge, and complement these bounds with algorithm-dependent dynamic regret lower bounds that match the improved bounds.

artificial intelligence, machine learning, optimization problem, (18 more...)

arXiv.org Machine Learning

2510.11691

Genre: Research Report > New Finding (0.46)

Industry: Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.49)

Add feedback

db346ccb62d491029b590bbbf0f5c412-Paper.pdf

Neural Information Processing SystemsAug-16-2025, 19:40:02 GMT

algorithm, optimistic hedge, swap regret, (15 more...)

Neural Information Processing Systems

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Industry: Leisure & Entertainment > Games (0.47)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Beyond $\tilde{O}(\sqrt{T})$ Constraint Violation for Online Convex Optimization with Adversarial Constraints

Sinha, Abhishek, Vaze, Rahul

arXiv.org Machine LearningMay-13-2025

We revisit the Online Convex Optimization problem with adversarial constraints (COCO) where, in each round, a learner is presented with a convex cost function and a convex constraint function, both of which may be chosen adversarially. The learner selects actions from a convex decision set in an online fashion, with the goal of minimizing both regret and the cumulative constraint violation (CCV) over a horizon of $T$ rounds. The best-known policy for this problem achieves $O(\sqrt{T})$ regret and $\tilde{O}(\sqrt{T})$ CCV. In this paper, we present a surprising improvement that achieves a significantly smaller CCV by trading it off with regret. Specifically, for any bounded convex cost and constraint functions, we propose an online policy that achieves $\tilde{O}(\sqrt{dT}+ T^β)$ regret and $\tilde{O}(dT^{1-β})$ CCV, where $d$ is the dimension of the decision set and $β\in [0,1]$ is a tunable parameter. We achieve this result by first considering the special case of $\textsf{Constrained Expert}$ problem where the decision set is a probability simplex and the cost and constraint functions are linear. Leveraging a new adaptive small-loss regret bound, we propose an efficient policy for the $\textsf{Constrained Expert}$ problem, that attains $O(\sqrt{T\ln N}+T^β)$ regret and $\tilde{O}(T^{1-β} \ln N)$ CCV, where $N$ is the number of experts. The original problem is then reduced to the $\textsf{Constrained Expert}$ problem via a covering argument. Finally, with an additional smoothness assumption, we propose an efficient gradient-based policy attaining $O(T^{\max(\frac{1}{2},β)})$ regret and $\tilde{O}(T^{1-β})$ CCV.

artificial intelligence, constraint violation, machine learning, (16 more...)

arXiv.org Machine Learning

2505.06709

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Asia > India > Maharashtra > Mumbai (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (0.87)

Add feedback

HyperArm Bandit Optimization: A Novel approach to Hyperparameter Optimization and an Analysis of Bandit Algorithms in Stochastic and Adversarial Settings

Karroum, Samih, Mazhar, Saad

arXiv.org Artificial IntelligenceMar-13-2025

This paper explores the application of bandit algorithms in both stochastic and adversarial settings, with a focus on theoretical analysis and practical applications. The study begins by introducing bandit problems, distinguishing between stochastic and adversarial variants, and examining key algorithms such as Explore-Then-Commit (ETC), Upper Confidence Bound (UCB), and Exponential-Weight Algorithm for Exploration and Exploitation (EXP3). Theoretical regret bounds are analyzed to compare the performance of these algorithms. The paper then introduces a novel framework, HyperArm Bandit Optimization (HABO), which applies EXP3 to hyperparameter tuning in machine learning models. Unlike traditional methods that treat entire configurations as arms, HABO treats individual hyperparameters as super-arms, and its potential configurations as sub-arms, enabling dynamic resource allocation and efficient exploration. Experimental results demonstrate HABO's effectiveness in classification and regression tasks, outperforming Bayesian Optimization in terms of computational efficiency and accuracy. The paper concludes with insights into the convergence guarantees of HABO and its potential for scalable and robust hyperparameter optimization.

artificial intelligence, data mining, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2503.10282

Genre:

Research Report > Promising Solution (0.50)
Overview > Innovation (0.50)

Industry: Energy > Oil & Gas > Upstream (0.49)

Technology:

Information Technology > Data Science > Data Mining > Big Data (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Online Decision-Making in General Combinatorial Spaces

Arun Rajkumar, Shivani Agarwal

Neural Information Processing SystemsFeb-10-2025, 00:29:01 GMT

We study online combinatorial decision problems, where one must make sequential decisions in some combinatorial space without knowing in advance the cost of decisions on each trial; the goal is to minimize the total regret over some sequence of trials relative to the best fixed decision in hindsight. Such problems have been studied mostly in settings where decisions are represented by Boolean vectors and costs are linear in this representation. Here we study a general setting where costs may be linear in any suitable low-dimensional vector representation of elements of the decision space. We give a general algorithm for such problems that we call low-dimensional online mirror descent (LDOMD); the algorithm generalizes both the Component Hedge algorithm of Koolen et al. (2010), and a recent algorithm of Suehiro et al. (2012). Our study offers a unification and generalization of previous work, and emphasizes the role of the convex polytope arising from the vector representation of the decision space; while Boolean representations lead to 0-1 polytopes, more general vector representations lead to more general polytopes. We study several examples of both types of polytopes. Finally, we demonstrate the benefit of having a general framework for such problems via an application to an online transportation problem; the associated transportation polytopes generalize the Birkhoff polytope of doubly stochastic matrices, and the resulting algorithm generalizes the PermELearn algorithm of Helmbold and Warmuth (2009).

algorithm, artificial intelligence, polytope, (15 more...)

Neural Information Processing Systems

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Japan > Honshū > Chūbu > Nagano Prefecture > Nagano (0.04)
Asia > India > Karnataka > Bengaluru (0.04)

Genre: Instructional Material (0.46)

Industry: Education (0.46)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)

Add feedback

A Drifting-Games Analysis for Online Learning and Applications to Boosting

Haipeng Luo, Robert E. Schapire

Neural Information Processing SystemsFeb-9-2025, 20:26:00 GMT

We provide a general mechanism to design online learning algorithms based on a minimax analysis within a drifting-games framework. Different online learning settings (Hedge, multi-armed bandit problems and online convex optimization) are studied by converting into various kinds of drifting games. The original minimax analysis for drifting games is then used and generalized by applying a series of relaxations, starting from choosing a convex surrogate of the 0-1 loss function. With different choices of surrogates, we not only recover existing algorithms, but also propose new algorithms that are totally parameter-free and enjoy other useful properties. Moreover, our drifting-games framework naturally allows us to study high probability bounds without resorting to any concentration results, and also a generalized notion of regret that measures how good the algorithm is compared to all but the top small fraction of candidates. Finally, we translate our new Hedge algorithm into a new adaptive boosting algorithm that is computationally faster as shown in experiments, since it ignores a large number of examples on each round.

artificial intelligence, data mining, machine learning, (19 more...)

Neural Information Processing Systems

Country:

North America > United States > New Jersey > Mercer County > Princeton (0.04)
North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Industry: Education > Educational Setting > Online (0.92)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.92)
Information Technology > Data Science > Data Mining > Big Data (0.90)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.58)

Add feedback

Adaptive Hedge

Neural Information Processing SystemsMar-15-2024, 02:59:11 GMT

Most methods for decision-theoretic online learning are based on the Hedge algorithm, which takes a parameter called the learning rate. In most previous analyses the learning rate was carefully tuned to obtain optimal worst-case performance, leading to suboptimal performance on easy instances, for example when there exists an action that is significantly better than all others. We propose a new way of setting the learning rate, which adapts to the difficulty of the learning problem: in the worst case our procedure still guarantees optimal performance, but on easy instances it achieves much smaller regret. In particular, our adaptive method achieves constant regret in a probabilistic setting, when there exists an action that on average obtains strictly smaller loss than all other actions. We also provide a simulation study comparing our approach to existing methods.

adahedge, algorithm, hedge, (17 more...)

Neural Information Processing Systems

Country:

Europe > Netherlands > North Holland > Amsterdam (0.05)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Industry: Education > Educational Setting > Online (0.35)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.34)

Add feedback