Off-Policy Correction For Multi-Agent Reinforcement Learning
Zawalski, Michał, Osiński, Błażej, Michalewski, Henryk, Miłoś, Piotr
arXiv.org Artificial Intelligence
Multi-agent reinforcement learning (MARL) provides a framework for problems involving multiple interacting agents. Despite their apparent similarity to the single-agent case, multi-agent problems are often harder to train and to analyze theoretically. In this work, we propose MA-Trace, a new on-policy actor-critic algorithm, which extends V-Trace to the MARL setting. The key advantage of our algorithm is its high scalability in a multi-worker setting. To this end, MA-Trace uses importance sampling as an off-policy correction method, which allows the computations to be distributed with no impact on the quality of training. Furthermore, our algorithm is theoretically grounded: we prove a fixed-point theorem that guarantees convergence. We evaluate the algorithm extensively on the StarCraft Multi-Agent Challenge, a standard benchmark for multi-agent algorithms. MA-Trace achieves high performance on all its tasks and exceeds state-of-the-art results on some of them.
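To make the idea of importance sampling as an off-policy correction more concrete, the following is a minimal sketch of V-Trace-style value targets with truncated importance weights, the single-agent construction that the abstract says MA-Trace extends. It is not the authors' implementation. The function names, the input layout, and the default thresholds `rho_bar` and `c_bar` are illustrative assumptions, and `joint_log_ratio` additionally assumes that the joint policy factorises across agents, so that the joint importance ratio is the product of per-agent ratios.

```python
import math


def vtrace_targets(rewards, values, bootstrap_value, log_ratios,
                   gamma=0.99, rho_bar=1.0, c_bar=1.0):
    """Compute V-Trace-style value targets for one trajectory of length n.

    rewards[t], values[t], log_ratios[t] all refer to step t, where
    log_ratios[t] = log pi(a_t | x_t) - log mu(a_t | x_t) and mu is the
    behaviour policy that generated the data (hypothetical input layout).
    bootstrap_value is the value estimate for the state after the last step.
    """
    n = len(rewards)
    next_values = list(values[1:]) + [bootstrap_value]
    targets = [0.0] * n
    acc = 0.0  # carries v_{t+1} - V(x_{t+1}) backwards through time
    for t in reversed(range(n)):
        ratio = math.exp(log_ratios[t])
        rho = min(rho_bar, ratio)  # truncated weight on the TD error
        c = min(c_bar, ratio)      # truncated weight on the trace
        delta = rho * (rewards[t] + gamma * next_values[t] - values[t])
        acc = delta + gamma * c * acc
        targets[t] = values[t] + acc
    return targets


def joint_log_ratio(per_agent_log_ratios):
    """Assumption: the joint policy factorises over agents, so the joint
    importance ratio is a product, i.e. a sum in log space."""
    return sum(per_agent_log_ratios)
```

When the data is on-policy (all log-ratios are zero), the targets reduce to ordinary bootstrapped n-step returns. When the behaviour policy differs from the current one, the truncated weights shrink the correction terms, which is what lets stale data from many workers be reused.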
Nov-22-2021
- Country:
  - Africa > Ethiopia (0.04)
  - Oceania
    - New Zealand > North Island
      - Auckland Region > Auckland (0.04)
    - Australia > New South Wales
      - Sydney (0.04)
  - North America
    - United States
      - New York > New York County
        - New York City (0.04)
      - Massachusetts > Hampshire County
        - Amherst (0.14)
      - Louisiana > Orleans Parish
        - New Orleans (0.04)
      - California
        - Los Angeles County > Long Beach (0.14)
        - Santa Clara County > Stanford (0.04)
    - Puerto Rico > San Juan
      - San Juan (0.04)
    - Canada
      - Quebec > Montreal (0.04)
      - British Columbia > Metro Vancouver Regional District
        - Vancouver (0.04)
  - Europe
    - Sweden > Stockholm
      - Stockholm (0.04)
    - Spain > Catalonia
      - Barcelona Province > Barcelona (0.04)
    - Poland > Masovia Province
      - Warsaw (0.05)
- Genre:
  - Research Report (0.64)
- Industry:
  - Leisure & Entertainment > Games > Computer Games (0.50)