PIC: Permutation Invariant Critic for Multi-Agent Deep Reinforcement Learning

Liu, Iou-Jen; Yeh, Raymond A.; Schwing, Alexander G.

Oct-31-2019–arXiv.org Machine Learning

Single-agent deep reinforcement learning has achieved impressive performance in many domains, including playing Go [1, 2] and Atari games [3, 4]. However, many real-world problems, such as traffic congestion reduction [5, 6], antenna tilt control [7], and dynamic resource allocation [8], are more naturally modeled as multi-agent systems. Unfortunately, directly applying single-agent reinforcement learning to each agent in a multi-agent system does not yield satisfactory performance [9, 10]. In particular, in multi-agent reinforcement learning [8, 10-19], estimating the value function is challenging because the environment is non-stationary from the perspective of an individual agent [10, 11]. To alleviate this issue, multi-agent deep deterministic policy gradient (MADDPG) [10] recently proposed a centralized critic whose input is the concatenation of all agents' observations and actions.
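A minimal sketch of why the input format of a centralized critic matters, assuming a toy setting that is not taken from the paper: a critic fed the concatenation of all agents' observations and actions generally changes its output when the agents are reordered, while a critic that pools per-agent features does not. The sum pooling used here is only an illustrative stand-in for a permutation invariant critic, not the architecture the authors propose, and all names (make_mlp, concat_critic, pooled_critic), dimensions, and random weights are hypothetical.

```python
import math
import random


def make_mlp(in_dim, hidden_dim, rng):
    """Random weights for a one-hidden-layer MLP with a scalar output (toy, untrained)."""
    w1 = [[rng.uniform(-1, 1) for _ in range(in_dim)] for _ in range(hidden_dim)]
    w2 = [rng.uniform(-1, 1) for _ in range(hidden_dim)]
    return w1, w2


def mlp(params, x):
    """Forward pass: tanh hidden layer followed by a linear scalar output."""
    w1, w2 = params
    hidden = [math.tanh(sum(w * v for w, v in zip(row, x))) for row in w1]
    return sum(w * h for w, h in zip(w2, hidden))


def concat_critic(params, obs, acts):
    """Concatenation-style input: all observations, then all actions, in agent order."""
    x = [v for o in obs for v in o] + [v for a in acts for v in a]
    return mlp(params, x)


def pooled_critic(params, obs, acts):
    """Assumed alternative: sum per-agent (observation, action) vectors before the MLP."""
    per_agent = [o + a for o, a in zip(obs, acts)]
    pooled = [sum(column) for column in zip(*per_agent)]
    return mlp(params, pooled)


rng = random.Random(0)
n_agents, obs_dim, act_dim = 3, 4, 2
obs = [[rng.uniform(-1, 1) for _ in range(obs_dim)] for _ in range(n_agents)]
acts = [[rng.uniform(-1, 1) for _ in range(act_dim)] for _ in range(n_agents)]

concat_params = make_mlp(n_agents * (obs_dim + act_dim), 8, rng)
pooled_params = make_mlp(obs_dim + act_dim, 8, rng)

# Reorder the agents; the joint state of the system is unchanged.
perm = [2, 0, 1]
obs_p = [obs[i] for i in perm]
acts_p = [acts[i] for i in perm]

concat_gap = abs(concat_critic(concat_params, obs, acts)
                 - concat_critic(concat_params, obs_p, acts_p))
pooled_gap = abs(pooled_critic(pooled_params, obs, acts)
                 - pooled_critic(pooled_params, obs_p, acts_p))

print("concatenation critic, change under permutation:", concat_gap)
print("pooled critic, change under permutation:", pooled_gap)
```

In this sketch the pooled critic's value changes only by floating-point rounding when the agents are reordered, whereas the concatenation critic treats each ordering as a different input and would have to learn the equivalence from data.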

agent, mlp critic, permutation invariant critic, (12 more...)

Country:
- North America > United States
  - Illinois > Champaign County > Champaign (0.04)
- Asia > Japan
  - Honshū > Kansai > Osaka Prefecture > Osaka (0.04)

Genre:
- Research Report
  - New Finding (0.46)
  - Experimental Study (0.30)

Industry:
- Leisure & Entertainment > Games > Computer Games (0.68)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning > Agents
    - Agent Societies (0.50)
  - Machine Learning
    - Reinforcement Learning (1.00)
    - Neural Networks > Deep Learning (0.46)
