Asynchronous, Option-Based Multi-Agent Policy Gradient: A Conditional Reasoning Approach

Lyu, Xubo, Banitalebi-Dehkordi, Amin, Chen, Mo, Zhang, Yong

Aug-2-2023–arXiv.org Artificial Intelligence

Cooperative multi-agent problems often require coordination between agents, which can be achieved through a centralized policy that considers the global state. Multi-agent policy gradient (MAPG) methods are commonly used to learn such policies, but they are often limited to problems with low-level action spaces. In complex problems with large state and action spaces, it is advantageous to extend MAPG methods to use higher-level actions, also known as options, to improve the policy search efficiency. However, multi-robot option executions are often asynchronous, that is, agents may select and complete their options at different time steps. This makes it difficult for MAPG methods to derive a centralized policy and evaluate its gradient, as centralized policy always select new options at the same time. In this work, we propose a novel, conditional reasoning approach to address this problem and demonstrate its effectiveness on representative option-based multi-agent cooperative tasks through empirical validation. Find code and videos at: \href{https://sites.google.com/view/mahrlsupp/}{https://sites.google.com/view/mahrlsupp/}

agent, artificial intelligence, trajectory, (16 more...)

arXiv.org Artificial Intelligence

Aug-2-2023

arXiv.org PDF

Add feedback

Country:
- North America > Canada (0.14)
- South America > Brazil (0.14)

Genre:
- Research Report (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning > Agents
    - Agent Societies (0.35)
  - Robots (1.00)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found