Fixed-Confidence Guarantees for Bayesian Best-Arm Identification

Shang, Xuedong, de Heide, Rianne, Kaufmann, Emilie, Ménard, Pierre, Valko, Michal

Oct-28-2019–arXiv.org Machine Learning

In particular, we justify its use for fixed-confidence best-arm identification . We further propose a variant of TTTS called Top-Two Transportation Cost ( T3C), which disposes of the computational burden of TTTS . As our main contribution, we provide the first sample complexity analysis of TTTS and T3C when coupled with a very natural Bayesian stopping rule, for bandits with Gaussian rewards, solving one of the open questions raised by Russo (2016). We also provide new posterior convergence results for TTTS under two models that are commonly used in practice: bandits with Gaussian and Bernoulli rewards and conjugate priors. 1 Introduction In multi-armed bandits, a learner repeatedly chooses an arm to play, and receives a reward from the associated unknown probability distribution. When the task is best-arm identification (BAI), the learner is not only asked to sample an arm at each stage, but is also asked to output a recommendation (i.e., a guess for the arm with the largest mean reward) after a certain period.

exp null 2, poly, ttt, (16 more...)

arXiv.org Machine Learning

Oct-28-2019

arXiv.org PDF

Add feedback

Country:
- Europe > Netherlands > South Holland > Leiden (0.04)

Genre:
- Workflow (0.46)
- Research Report (0.40)

Technology:
- Information Technology
  - Data Science > Data Mining
    - Big Data (0.66)
  - Artificial Intelligence
    - Machine Learning (1.00)
    - Representation & Reasoning (0.92)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found