Minimax Optimal Algorithms for Adversarial Bandit Problem with Multiple Plays

Vural, N. Mert, Gokcesu, Hakan, Gokcesu, Kaan, Kozat, Suleyman S.

Nov-25-2019–arXiv.org Artificial Intelligence

We investigate the adversarial bandit problem with multiple plays under semi-bandit feedback. We introduce a highly efficient algorithm that asymptotically achieves the performance of the best switching $m$-arm strategy with minimax optimal regret bounds. To construct our algorithm, we introduce a new expert advice algorithm for the multiple-play setting. By using our expert advice algorithm, we additionally improve the best-known high-probability bound for the multi-play setting by $O(\sqrt{m})$. Our results are guaranteed to hold in an individual sequence manner since we have no statistical assumption on the bandit arm gains. Through an extensive set of experiments involving synthetic and real data, we demonstrate significant performance gains achieved by the proposed algorithm with respect to the state-of-the-art algorithms.

algorithm, exp3, exp4, (15 more...)

arXiv.org Artificial Intelligence

Nov-25-2019

arXiv.org PDF

Add feedback

Country:
- North America > United States
  - Massachusetts > Middlesex County > Cambridge (0.14)
- Europe > United Kingdom
  - England > Cambridgeshire > Cambridge (0.04)
- Asia
  - Middle East > Republic of Türkiye
    - Ankara Province > Ankara (0.04)
  - Japan > Honshū
    - Chūbu > Nagano Prefecture > Nagano (0.04)

Genre:
- Research Report (1.00)

Technology:
- Information Technology
  - Data Science > Data Mining
    - Big Data (1.00)
  - Artificial Intelligence
    - Machine Learning (1.00)
    - Representation & Reasoning > Search (0.71)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found