Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning

Foerster, Jakob, Nardelli, Nantas, Farquhar, Gregory, Afouras, Triantafyllos, Torr, Philip H. S., Kohli, Pushmeet, Whiteson, Shimon

Jun-12-2017–arXiv.org Artificial Intelligence

Many real-world problems, such as network packet routing and urban traffic control, are naturally modeled as multi-agent reinforcement learning (RL) problems. However, existing multi-agent RL methods typically scale poorly in the problem size. Therefore, a key challenge is to translate the success of deep learning on single-agent RL to the multi-agent setting. A major stumbling block is that independent Q-learning, the most popular multi-agent RL method, introduces nonstationarity that makes it incompatible with the experience replay memory on which deep Q-learning relies. This paper proposes two methods that address this problem: 1) using a multi-agent variant of importance sampling to naturally decay obsolete data and 2) conditioning each agent's value function on a fingerprint that disambiguates the age of the data sampled from the replay memory. Results on a challenging decentralised variant of StarCraft unit micromanagement confirm that these methods enable the successful combination of experience replay with multi-agent RL.

artificial intelligence, machine learning, reinforcement learning, (13 more...)

arXiv.org Artificial Intelligence

Jun-12-2017

arXiv.org PDF

Add feedback

Country:
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.28)

Genre:
- Research Report > New Finding (0.46)

Industry:
- Leisure & Entertainment > Games > Computer Games (0.68)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning > Agents (1.00)
  - Machine Learning
    - Reinforcement Learning (1.00)
    - Neural Networks > Deep Learning (1.00)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found