Relational Weight Optimization for Enhancing Team Performance in Multi-Agent Multi-Armed Bandits

Kotturu, Monish Reddy, Movahed, Saniya Vahedian, Robinette, Paul, Jerath, Kshitij, Redlich, Amanda, Azadeh, Reza

Oct-30-2024–arXiv.org Artificial Intelligence

Using a graph to represent the team behavior ensures that the relationship between Multi-Armed Bandits (MABs) are a class of reinforcement the agents are held. However, existing works either do learning problems where an agent is presented with a set of not consider the weight of each relationship (graph edges) arms (i.e., actions), with each arm giving a reward drawn (Madhushani and Leonard, 2020; Agarwal et al., 2021) or from a probability distribution unknown to the agent expect the user to manually set those weights (Moradipari (Lattimore and Szepesvári, 2020). The goal of the agent et al., 2022). is to maximize its total reward which requires balancing In this paper, we propose a new approach that combines exploration and exploitation. MABs offer a simple model graph optimization and MAMAB algorithms to enhance to simulate decision-making under uncertainty. Practical team performance by expediting the convergence to consensus applications of MAB algorithms include news recommendations of arm means. Our proposed approach: (Yang and Toni, 2018), online ad placement (Aramayo et al., 2022), dynamic pricing (Babaioff et al., 2015), improves team performance by optimizing the edge and adaptive experimental design (Rafferty et al., 2019). In weights in the graph representing the team structure contrast to single-agent cases, in certain applications such in large constrained teams, as search and rescue, a team of agents should cooperate does not require manual tuning of the graph weights, with each other to accomplish goals by maximizing team is independent of the MAMAB algorithm and only performance. Such problems are solved using Multi-Agent depends on the consensus formula, and Multi-Armed Bandit (MAMAB) algorithms (Xu et al., formulates the problem as a convex optimization, which 2020). Most existing algorithms rely on the presence of is computationally efficient for large teams.

artificial intelligence, big data, data mining, (17 more...)

arXiv.org Artificial Intelligence

Oct-30-2024

arXiv.org PDF

Add feedback

Country:
- Europe > United Kingdom
  - England > Cambridgeshire > Cambridge (0.04)
- North America > United States
  - Massachusetts > Middlesex County
    - Lowell (0.15)
  - Texas > Bexar County
    - San Antonio (0.14)

Genre:
- Research Report (1.00)

Industry:
- Education (0.34)

Technology:
- Information Technology
  - Artificial Intelligence > Representation & Reasoning
    - Agents (1.00)
  - Data Science > Data Mining
    - Big Data (1.00)