Thresholding Bandit with Optimal Aggregate Regret

Chao Tao, Saúl Blanco, Jian Peng, Yuan Zhou

May-27-2019–arXiv.org Machine Learning

We consider the thresholding bandit problem, in which the goal is to identify the arms whose mean rewards lie above a given threshold $\theta$ using a fixed budget of $T$ trials. We introduce LSA, a new, simple, anytime algorithm that aims to minimize the aggregate regret (that is, the expected number of misclassified arms). We prove that the algorithm is instance-wise asymptotically optimal. We also provide comprehensive empirical results demonstrating its superior performance over existing algorithms in a variety of scenarios.
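To make the objective concrete, the following is a minimal sketch of the problem setup and of how aggregate regret could be estimated by simulation. It is not the LSA algorithm, whose details are not given in this abstract; it uses a plain round-robin (uniform sampling) baseline instead. The function names, the Bernoulli reward model, and the convention that an arm with mean exactly equal to the threshold counts as "above" are all assumptions made for illustration.

```python
import random


def misclassified(means, estimates, theta):
    """Count arms whose estimated side of theta differs from their true side.

    Assumption: an arm with mean >= theta is labeled "above the threshold".
    """
    return sum((mu >= theta) != (est >= theta)
               for mu, est in zip(means, estimates))


def uniform_sampling_run(means, theta, budget, rng):
    """One run of a hypothetical baseline (NOT LSA): pull arms round-robin
    for `budget` trials, then classify each arm by its empirical mean."""
    k = len(means)
    sums = [0.0] * k
    pulls = [0] * k
    for t in range(budget):
        i = t % k
        # Assumed reward model: Bernoulli with success probability means[i].
        sums[i] += 1.0 if rng.random() < means[i] else 0.0
        pulls[i] += 1
    estimates = [s / n if n else 0.0 for s, n in zip(sums, pulls)]
    return misclassified(means, estimates, theta)


def aggregate_regret(means, theta, budget, runs=200, seed=0):
    """Monte Carlo estimate of the expected number of misclassified arms."""
    rng = random.Random(seed)
    total = sum(uniform_sampling_run(means, theta, budget, rng)
                for _ in range(runs))
    return total / runs
```

For example, `aggregate_regret([0.3, 0.45, 0.55, 0.8], theta=0.5, budget=400)` estimates the regret of the baseline on a four-arm instance. Arms whose means sit close to the threshold dominate the error, which is why an adaptive allocation of the budget can be expected to help.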

Country:
- North America > United States
  - Illinois (0.04)
  - New York (0.04)
  - Indiana > Monroe County
    - Bloomington (0.04)

Genre:
- Research Report > Experimental Study (0.46)

Technology:
- Information Technology
  - Data Science > Data Mining
    - Big Data (0.67)
  - Artificial Intelligence
    - Representation & Reasoning (1.00)
    - Machine Learning (1.00)
