Context-Aware Bayesian Network Actor-Critic Methods for Cooperative Multi-Agent Reinforcement Learning
–arXiv.org Artificial Intelligence
Executing actions in a correlated manner is a common strategy for human coordination that often leads to better cooperation, which is also potentially beneficial for cooperative multi-agent reinforcement learning (MARL). However, the recent success of MARL relies heavily on the convenient paradigm of purely decentralized execution, where there is no action correlation among agents for scalability considerations. In this work, we introduce a Bayesian network to inaugurate correlations between agents' action selections in their joint policy. Theoretically, we establish a theoretical justification for why action dependencies are beneficial by deriving the multi-agent policy gradient formula under such a Bayesian network joint policy and proving its global convergence to Nash equilibria under tabular softmax policy parameterization in cooperative Markov games.

Cooperative multi-agent reinforcement learning (MARL) methods equip a group of autonomous agents with the capability of planning and learning to maximize their joint utility, or reward signals in the reinforcement learning (RL) literature, which provides a promising paradigm for a range of real-world applications, such as traffic control (Chu et al., 2019), coordination of multi-robot systems (Corke et al., 2005), and power grid management (Callaway & Hiskens, 2010). As a key distinction from the single-agent setting, multi-agent joint action spaces grow exponentially with the number of agents, which imposes significant scalability issues. As a convenient and commonly adopted solution, most existing cooperative MARL methods only consider product policies, i.e., each agent selects its local action independently given the state or its observations. Restricting to product policies, however, does come at a cost for cooperative tasks: consider an example where cars wait at a crossroads, it would be hard for the cars to coordinate their movements without knowing others' intentions, potentially
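The contrast between a product policy and a Bayesian network joint policy can be illustrated with a small sketch. The code below is not the authors' implementation; it is a minimal tabular softmax example under stated assumptions: agents are indexed in topological order of the network, each agent's logits are stored in a dictionary keyed by the state and its parents' actions, and the two-car crossroads setup (actions 0 = wait, 1 = go) together with every name in the code is hypothetical.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_joint_action(logits, parents, state, rng):
    """Sample a joint action agent by agent, in topological order.

    logits[i] maps (state, parent_actions) to agent i's action logits.
    parents[i] lists the agents whose actions agent i conditions on.
    Assumption: every parent index is smaller than i.
    """
    actions = []
    for i, pa in enumerate(parents):
        assert all(j < i for j in pa), "agents must be topologically ordered"
        key = (state, tuple(actions[j] for j in pa))
        probs = softmax(logits[i][key])
        actions.append(rng.choices(range(len(probs)), weights=probs)[0])
    return tuple(actions)


def joint_prob(logits, parents, state, joint_action):
    """Probability of a joint action: product of per-agent conditionals."""
    p = 1.0
    for i, pa in enumerate(parents):
        key = (state, tuple(joint_action[j] for j in pa))
        p *= softmax(logits[i][key])[joint_action[i]]
    return p


# Hypothetical crossroads example with two cars; actions: 0 = wait, 1 = go.
# Bayesian network policy: car 1 observes car 0's action and does the opposite.
bn_parents = [[], [0]]
bn_logits = [
    {("s", ()): [0.0, 0.0]},              # car 0: uniform over wait/go
    {("s", (0,)): [-5.0, 5.0],            # car 0 waits -> car 1 mostly goes
     ("s", (1,)): [5.0, -5.0]},           # car 0 goes  -> car 1 mostly waits
]

# Product policy with no action dependencies: both cars act independently.
prod_parents = [[], []]
prod_logits = [
    {("s", ()): [0.0, 0.0]},
    {("s", ()): [0.0, 0.0]},
]

if __name__ == "__main__":
    for a in [(0, 0), (0, 1), (1, 0), (1, 1)]:
        print(a,
              "bn:", joint_prob(bn_logits, bn_parents, "s", a),
              "product:", joint_prob(prod_logits, prod_parents, "s", a))
```

In this toy setup the independent uniform product policy assigns probability 0.25 to both cars going at once, while the network policy concentrates almost all of its mass on the two outcomes where exactly one car goes. A product policy can still coordinate deterministically (one car always goes, the other always waits), but it cannot represent this correlated, randomized distribution.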
Jun-2-2023
- Country:
- North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
- Genre:
- Research Report (0.81)
- Industry:
- Energy > Power Industry (0.54)