AITopics | multiple best arm

Collaborating Authors

multiple best arm

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

On Regret with Multiple Best Arms

Neural Information Processing SystemsDec-24-2025, 03:09:09 GMT

We study a regret minimization problem with the existence of multiple best/near-optimal arms in the multi-armed bandit setting. We consider the case when the number of arms/actions is comparable or much larger than the time horizon, and make no assumptions about the structure of the bandit instance. Our goal is to design algorithms that can automatically adapt to the unknown hardness of the problem, i.e., the number of best arms. Our setting captures many modern applications of bandit algorithms where the action space is enormous and the information about the underlying instance/structure is unavailable. We first propose an adaptive algorithm that is agnostic to the hardness level and theoretically derive its regret bound. We then prove a lower bound for our problem setting, which indicates: (1) no algorithm can be minimax optimal simultaneously over all hardness levels; and (2) our algorithm achieves a rate function that is Pareto optimal. With additional knowledge of the expected reward of the best arm, we propose another adaptive algorithm that is minimax optimal, up to polylog factors, over all hardness levels. Experimental results confirm our theoretical guarantees and show advantages of our algorithms over the previous state-of-the-art.

algorithm, multiple best arm, name change, (7 more...)

Neural Information Processing Systems

Technology:

Information Technology > Data Science > Data Mining > Big Data (0.60)
Information Technology > Artificial Intelligence > Machine Learning (0.40)

Add feedback

Review for NeurIPS paper: On Regret with Multiple Best Arms

Neural Information Processing SystemsJan-25-2025, 05:01:23 GMT

Summary and Contributions: The authors consider a stochastic bandit setting with n arms, where n is taken to be very large. Unlike other large-armed bandit work, there is no structure assumed between the payoff of the various arms. Their methods rely heavily on MOSS [Audibert et al.], which attains an optimal regret rate of \sqrt{nT} in the usual setting where n T. Letting m n denote the number of arms that are optimal (or near optimal), the authors begin by defining alpha satisfying n/m T {\alpha} as a key quantity characterizing the hardness. They observe that if alpha were known, one could sample a subset of the arms of cardinality T alpha log T, which contains an optimal arm whp. Applying MOSS attains a regret bound of O(T (1 alpha)/2).

algorithm, multiple best arm, neurips paper, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.40)

Add feedback

On Regret with Multiple Best Arms

Neural Information Processing SystemsOct-10-2024, 10:06:18 GMT

algorithm, hardness, multiple best arm, (2 more...)

Neural Information Processing Systems

Technology:

Information Technology > Data Science > Data Mining > Big Data (0.63)
Information Technology > Artificial Intelligence > Machine Learning (0.43)

Add feedback