Optimal Estimation of the Best Mean in Multi-Armed Bandits

Jun-14-2026, 09:53:03 GMT–Neural Information Processing Systems

We study the problem of estimating the mean reward of the best arm in a multi-armed bandit (MAB) setting. Specifically, given a target precision ε and confidence level 1 − δ, the goal is to return an ε-accurate estimate of the largest mean reward with probability at least 1 − δ, while minimizing the number of samples. We first establish an instance-dependent lower bound on the sample complexity, which requires handling the infinitely many possible candidates for the estimated best mean. This lower bound is expressed as a non-convex optimization problem, which is the main difficulty here: it prevents standard techniques such as Track-and-Stop from being applied directly to provably achieve optimality. To overcome this difficulty, we introduce several new algorithmic and analytical techniques and propose an algorithm that achieves the asymptotic lower bound with matching constants in the leading term. Our method combines a confidence ellipsoid-based stopping condition with a two-phase sampling strategy tailored to manage non-convexity. The proposed algorithm is simple, nearly free of hyperparameters, and achieves the instance-dependent, asymptotically optimal sample complexity. Experimental results support our theoretical guarantees and demonstrate the practical effectiveness of our method.
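To make the (ε, δ) goal concrete, here is a minimal sketch of a naive baseline; it is not the algorithm proposed in the paper. It assumes rewards bounded in [0, 1], pulls every arm the same number of times, and returns the largest empirical mean. Hoeffding's inequality plus a union bound over the K arms gives the per-arm sample size n = ⌈ln(2K/δ) / (2ε²)⌉, after which every empirical mean is within ε of its true mean with probability at least 1 − δ, and so is their maximum. The names `uniform_sample_size`, `estimate_best_mean`, and `bernoulli_arm` are hypothetical helpers introduced for illustration only.

```python
import math
import random


def uniform_sample_size(num_arms, eps, delta):
    """Per-arm sample size from Hoeffding's inequality and a union bound.

    Assumes rewards lie in [0, 1] (an assumption of this sketch).
    """
    return math.ceil(math.log(2 * num_arms / delta) / (2 * eps ** 2))


def estimate_best_mean(arms, eps, delta):
    """Naive uniform-sampling baseline, not the paper's method.

    `arms` is a list of zero-argument callables, each returning one reward.
    Returns (estimate of the largest mean, total number of samples used).
    """
    n = uniform_sample_size(len(arms), eps, delta)
    empirical_means = [sum(pull() for _ in range(n)) / n for pull in arms]
    return max(empirical_means), n * len(arms)


def bernoulli_arm(p, rng):
    """Hypothetical test arm: Bernoulli reward with mean p."""
    return lambda: 1.0 if rng.random() < p else 0.0


rng = random.Random(0)  # fixed seed so the run is reproducible
true_means = [0.3, 0.5, 0.7]
arms = [bernoulli_arm(p, rng) for p in true_means]
estimate, total_samples = estimate_best_mean(arms, eps=0.05, delta=0.05)
print(f"estimate={estimate:.3f}, samples={total_samples}")
```

This baseline ignores the instance entirely: its cost depends only on K, ε, and δ, whereas the abstract's lower bound and algorithm are instance-dependent, so an adaptive sampling rule can stop much earlier on easy instances.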

artificial intelligence, data mining, machine learning

Country:
- North America > United States (0.28)

Genre:
- Research Report
  - New Finding (1.00)
  - Experimental Study (1.00)

Industry:
- Health & Medicine (0.93)

Technology:
- Information Technology
  - Data Science > Data Mining
    - Big Data (0.40)
  - Artificial Intelligence
    - Machine Learning (1.00)
    - Representation & Reasoning > Optimization (0.49)