Distributed Cooperative Decision Making in Multi-agent Multi-armed Bandits

Landgren, Peter, Srivastava, Vaibhav, Leonard, Naomi Ehrich

Mar-2-2020–arXiv.org Machine Learning

We study a distributed decision-making problem in which multiple agents face the same multi-armed bandit (MAB), and each agent makes sequential choices among arms to maximize its own individual reward. The agents cooperate by sharing their estimates over a fixed communication graph. We consider an unconstrained reward model in which two or more agents can choose the same arm and collect independent rewards. And we consider a constrained reward model in which agents that choose the same arm at the same time receive no reward. We design a dynamic, consensus-based, distributed estimation algorithm for cooperative estimation of mean rewards at each arm. We leverage the estimates from this algorithm to develop two distributed algorithms: coop-UCB2 and coop-UCB2-selective-learning, for the unconstrained and constrained reward models, respectively. We show that both algorithms achieve group performance close to the performance of a centralized fusion center. Further, we investigate the influence of the communication graph structure on performance. We propose a novel graph explore-exploit index that predicts the relative performance of groups in terms of the communication graph, and we propose a novel nodal explore-exploit centrality index that predicts the relative performance of agents in terms of the agent locations in the communication graph.

agent, algorithm, artificial intelligence, (17 more...)

arXiv.org Machine Learning

Mar-2-2020

arXiv.org PDF

Add feedback

Country:
- North America > United States
  - New Jersey (0.04)
  - Nevada > Clark County
    - Las Vegas (0.04)
  - Michigan > Ingham County
    - Lansing (0.04)
    - East Lansing (0.04)
  - California > Los Angeles County
    - Los Angeles (0.14)
- Europe
  - United Kingdom > England
    - Oxfordshire > Oxford (0.04)
  - Denmark > North Jutland
    - Aalborg (0.04)
- Asia > Japan
  - Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)

Genre:
- Research Report (0.40)

Technology:
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found