Batched Dueling Bandits

Agarwal, Arpit, Ghuge, Rohan, Nagarajan, Viswanath

Feb-21-2022–arXiv.org Machine Learning

The K-armed dueling bandits problem has been widely studied in machine learning due to its applications in search ranking, recommendation systems, sports ranking, etc. [3, 14, 16, 26, 29, 30, 34, 38, 41, 43-46]. It is a variation of the traditional stochastic bandit problem in which feedback is obtained in the form of pairwise preferences. This problem falls under the umbrella of preference learning [39], where the goal is to learn from relative feedback (in our case, given two alternatives, which of the two is preferred). Designing learning algorithms for such relative feedback becomes crucial in domains where qualitative feedback is easily obtained, but real-valued feedback would be arbitrary or not interpretable. We illustrate this using the web-search ranking application. Web-search ranking is an example of a complex information retrieval system, where the goal is to provide a list (usually ranked) of candidate documents to the user of the system in response to a query [25, 27, 33, 42]. Modern day search engines comprise hundreds of parameters which are used to output a ranked list in response to a query. However, manually tuning these parameters can sometimes be infeasible, and online learning frameworks (based on user feedback) have been invaluable in automatically tuning these parameters [31]. These methods do not affect user experience, enable the system to continuously learn about user preferences, and thus continuously adapt to user behavior.

algorithm, bandit, proceedings, (13 more...)

arXiv.org Machine Learning

Feb-21-2022

arXiv.org PDF

Add feedback

Country:
- North America
  - United States
    - District of Columbia > Washington (0.04)
    - New York > New York County
      - New York City (0.04)
  - Canada > British Columbia
    - Metro Vancouver Regional District > Vancouver (0.04)
- Europe
  - Sweden > Stockholm
    - Stockholm (0.04)
  - Spain > Catalonia
    - Barcelona Province > Barcelona (0.04)

Genre:
- Research Report (0.50)

Technology:
- Information Technology
  - Data Science > Data Mining
    - Big Data (0.88)
  - Artificial Intelligence
    - Machine Learning (1.00)
    - Natural Language > Information Retrieval (0.88)
    - Representation & Reasoning > Personal Assistant Systems (0.68)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found