Multi-Player Approaches for Dueling Bandits

Or Raveh, Junya Honda, Masashi Sugiyama

arXiv.org Machine Learning 

In decision-making under uncertainty, multi-armed bandit (MAB) [4] problems are a key paradigm with applications in recommendation systems and online advertising. These problems entail balancing the exploration-exploitation trade-off, as an agent draws from a set of K arms with unknown reward distributions to maximize cumulative reward or minimize regret over time. Two notable variations of MAB are the dueling-bandit problem and the cooperative multiplayer MAB problem. In the dueling-bandit scenario [36], feedback comes from pairwise comparisons between K arms, which is useful in human-feedback-driven tasks such as ranker evaluation [25] and preference-based recommendation systems [10]. Meanwhile, the cooperative multiplayer MAB focuses on a group of M players collaboratively solving challenges in a distributed decision-making environment, enhancing learning through shared information. This approach finds applications in fields like multi-robot systems [19] and distributed recommender systems [27]. The M-player K-arm cooperative dueling bandit problem, combining aspects of the two previously studied variations, introduces a new dimension to cooperative decision-making with preference-based feedback, yet remains unexplored to the best of our knowledge.
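To make the dueling-bandit feedback model concrete, the following is a minimal sketch (not the paper's algorithm): each round compares two arms and observes only which one wins, governed by an assumed preference matrix `P` where `P[i][j]` is the probability that arm `i` beats arm `j`. A naive uniform-exploration baseline then picks the arm with the most empirical wins (a Borda-style winner).

```python
import random

# Hypothetical preference matrix for 3 arms: P[i][j] = Pr(arm i beats arm j).
# Arm 0 is the strongest here by construction.
P = [[0.5, 0.6, 0.7],
     [0.4, 0.5, 0.6],
     [0.3, 0.4, 0.5]]

def duel(i, j, rng):
    """Simulate one pairwise comparison; return the winning arm's index."""
    return i if rng.random() < P[i][j] else j

def borda_explore(rounds=3000, seed=0):
    """Uniformly explore random arm pairs and return the arm with the
    most wins -- a naive illustrative baseline, not a regret-optimal method."""
    rng = random.Random(seed)
    k = len(P)
    wins = [0] * k
    for _ in range(rounds):
        i, j = rng.sample(range(k), 2)  # draw a pair of distinct arms
        wins[duel(i, j, rng)] += 1
    return max(range(k), key=lambda a: wins[a])

winner = borda_explore()
```

In the multiplayer extension studied here, M such learners would share comparison outcomes to speed up identifying the winning arm; the single-player sketch above only illustrates the relative (preference-based) feedback that replaces numeric rewards.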
