Combinatorial Pure Exploration of Dueling Bandit

Chen, Wei, Du, Yihan, Huang, Longbo, Zhao, Haoyu

Jun-23-2020–arXiv.org Machine Learning

In this paper, we study combinatorial pure exploration for dueling bandits (CPE-DB): there are multiple candidates for multiple positions, modeled by a bipartite graph, and in each round we sample a duel between two candidates for one position and observe the winner, with the goal of finding the best candidate-position matching with high probability after multiple rounds of sampling. CPE-DB is an adaptation of the original combinatorial pure exploration for multi-armed bandit (CPE-MAB) problem to the dueling bandit setting. We consider both the Borda winner and the Condorcet winner cases. For the Borda winner, we establish a reduction of the problem to the original CPE-MAB setting and design PAC and exact algorithms that achieve both a sample complexity similar to that in the CPE-MAB setting (which is nearly optimal for a subclass of problems) and polynomial running time per round. For the Condorcet winner, we first design a fully polynomial time approximation scheme (FPTAS) for the offline problem of finding the Condorcet winner with known winning probabilities, and then use the FPTAS as an oracle to design a novel pure exploration algorithm ${\sf CAR}$-${\sf Cond}$ with a sample complexity analysis. ${\sf CAR}$-${\sf Cond}$ is the first algorithm with polynomial running time per round for identifying the Condorcet winner in CPE-DB.
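The Borda-winner reduction mentioned in the abstract can be illustrated on a toy instance. The sketch below is not the paper's algorithm: it assumes that the Borda score of a candidate-position pair is its average winning probability against the other candidates for the same position, so that a duel against a uniformly random rival gives an unbiased 0/1 sample of that score, which is the kind of per-arm feedback a CPE-MAB method consumes. The instance (`CANDIDATES`, `P`), all function names, and the brute-force search over matchings (standing in for a polynomial-time maximization oracle) are hypothetical choices made for illustration.

```python
import itertools
import random

# Hypothetical toy instance: which candidates may fill which position.
CANDIDATES = {"x": ["a", "b", "c"], "y": ["b", "c", "d"]}

# Hypothetical winning probabilities: (position, a, b) -> P(a beats b).
_PAIRS = {
    ("x", "a", "b"): 0.7, ("x", "a", "c"): 0.6, ("x", "b", "c"): 0.55,
    ("y", "b", "c"): 0.8, ("y", "b", "d"): 0.6, ("y", "c", "d"): 0.3,
}
P = {}
for (pos, a, b), v in _PAIRS.items():
    P[(pos, a, b)] = v
    P[(pos, b, a)] = 1.0 - v  # a duel always has exactly one winner


def borda_scores(candidates, p):
    """Exact Borda score per (position, candidate): mean win probability
    against the other candidates for the same position (assumed definition)."""
    scores = {}
    for pos, cands in candidates.items():
        for a in cands:
            rivals = [b for b in cands if b != a]
            scores[(pos, a)] = sum(p[(pos, a, b)] for b in rivals) / len(rivals)
    return scores


def sample_borda(pos, a, candidates, p, rng):
    """One 0/1 sample whose mean is the Borda score of (pos, a):
    duel `a` against a uniformly random rival for the same position."""
    rival = rng.choice([b for b in candidates[pos] if b != a])
    return 1 if rng.random() < p[(pos, a, rival)] else 0


def estimate_borda(pos, a, candidates, p, n, rng):
    """Empirical mean of n simulated duels."""
    return sum(sample_borda(pos, a, candidates, p, rng) for _ in range(n)) / n


def best_matching(candidates, scores):
    """Brute-force the matching with the largest total score; each position
    gets one candidate and no candidate fills two positions."""
    positions = sorted(candidates)
    best, best_val = None, float("-inf")
    for choice in itertools.product(*(candidates[pos] for pos in positions)):
        if len(set(choice)) < len(choice):
            continue  # a candidate was assigned to two positions
        val = sum(scores[(pos, c)] for pos, c in zip(positions, choice))
        if val > best_val:
            best, best_val = dict(zip(positions, choice)), val
    return best


if __name__ == "__main__":
    exact = borda_scores(CANDIDATES, P)
    print(best_matching(CANDIDATES, exact))
    print(estimate_borda("x", "a", CANDIDATES, P, 20000, random.Random(0)))
```

On this instance the exact scores favor candidate `a` for position `x` and candidate `b` for position `y`. A pure exploration algorithm would not have access to `P` and would instead decide adaptively which pair to sample next; the uniform sampling in `estimate_borda` is only meant to show that the simulated duels concentrate around the assumed Borda score.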

algorithm, combinatorial pure exploration, sample complexity

Country:
- North America > United States
  - Michigan (0.04)
  - New York > New York County
    - New York City (0.04)
- Europe > Austria
  - Vienna (0.14)
- Asia > China
  - Beijing > Beijing (0.04)
  - Jiangsu Province > Nanjing (0.04)

Genre:
- Research Report (0.50)

Industry:
- Government > Voting & Elections (1.00)

Technology:
- Information Technology
  - Data Science > Data Mining
    - Big Data (0.66)
  - Artificial Intelligence
    - Representation & Reasoning (1.00)
    - Machine Learning (1.00)
