Agent-Time Attention for Sparse Rewards Multi-Agent Reinforcement Learning
Jennifer She, Jayesh K. Gupta, Mykel J. Kochenderfer
–arXiv.org Artificial Intelligence
Cooperative multi-agent reinforcement learning (MARL), where a team of agents learns coordinated policies optimizing a global team reward, has been extensively studied in recent years [25, 13] and finds potential applications in a wide variety of domains such as robot swarm control [15, 2], coordinating autonomous drivers [26, 41], and network routing [38, 4]. Although a cooperative MARL problem can be framed as a centralized single-agent problem, with the team acting as a single actor over the joint action space, this approach does not scale well: the joint action space grows exponentially with the number of agents. Moreover, due to real-world constraints on communication and observability, such a framing is often impractical for many real-world applications. Unfortunately, simply learning decentralized policies independently from local observations results in unstable learning and convergence issues due to the non-stationarity introduced by simultaneous exploration [12, 33]. This has led MARL methods to focus on the centralized training, decentralized execution (CTDE) paradigm, in which decentralized policies can access extra state information during training but not during execution.
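The scaling argument above can be made concrete with a minimal sketch (not from the paper): under the centralized framing, the joint action space is the Cartesian product of the per-agent action spaces, so its size is |A|^n for n agents with |A| actions each.

```python
# Illustrative sketch: joint action space size under a centralized
# single-agent framing of a cooperative MARL problem.
# The function name and parameters are assumptions for illustration.

def joint_action_space_size(n_agents: int, actions_per_agent: int) -> int:
    """Size of the joint action space: |A|^n, exponential in n_agents."""
    return actions_per_agent ** n_agents

# With just 5 actions per agent, the joint space explodes quickly:
for n in (2, 5, 10):
    print(n, joint_action_space_size(n, 5))
# 2 agents ->        25 joint actions
# 5 agents ->     3,125 joint actions
# 10 agents -> 9,765,625 joint actions
```

This exponential blow-up is why CTDE methods keep execution decentralized, with each agent selecting from its own (small) action space.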
Oct-31-2022