Collaborative Multi-Agent Heterogeneous Multi-Armed Bandits
Ronshee Chawla, Daniel Vial, Sanjay Shakkottai, R. Srikant
–arXiv.org Artificial Intelligence
The multi-armed bandit (MAB) problem is a paradigm for sequential decision-making under uncertainty, which involves allocating a resource to an action in order to obtain a reward. MABs address the tradeoff between exploration and exploitation while making sequential decisions. Owing to their utility in large-scale distributed systems (such as information retrieval [38], advertising [8], etc.), multi-agent versions of the classical MAB have been studied extensively in the last few years. In multi-agent MABs, multiple agents learn bandits while communicating over a network. The goal is to design communication strategies that allow efficient exploration of arms across agents, so that the agents can outperform single-agent MAB algorithms.

There exist many versions of multi-agent MABs in the literature (see Section 1.2 for an overview). We propose a new collaborative setting in which each of the N agents is learning one of M stochastic MABs (where each bandit has K arms and M < N), with the goal of minimizing the group cumulative regret, i.e., the sum of the individual cumulative regrets of all the agents.
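To make the objective concrete, here is a minimal formalization under standard bandit notation; the symbols below (m(i), a_i(t), mu) are illustrative assumptions, not taken verbatim from the paper. Let m(i) in {1, ..., M} denote the bandit assigned to agent i, let \mu_{m,k} be the mean reward of arm k of bandit m, and let a_i(t) be the arm agent i pulls at round t. The group cumulative regret over a horizon T is then

% Group cumulative regret: sum of per-agent regrets (illustrative notation).
R(T) \;=\; \sum_{i=1}^{N} R_i(T)
     \;=\; \sum_{i=1}^{N} \mathbb{E}\!\left[\sum_{t=1}^{T}
           \Bigl(\mu^{*}_{m(i)} - \mu_{m(i),\,a_i(t)}\Bigr)\right],
\qquad \mu^{*}_{m(i)} = \max_{1 \le k \le K} \mu_{m(i),k}.

Since M < N, several agents necessarily share the same bandit, which is what makes communication useful: agents learning the same bandit can split the cost of exploring its K arms rather than each paying it in full.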
May-30-2023