Consolidation via Policy Information Regularization in Deep RL for Multi-Agent Games

Malloy, Tyler, Klinger, Tim, Liu, Miao, Riemer, Matthew, Tesauro, Gerald, Sims, Chris R.

Nov-23-2020–arXiv.org Artificial Intelligence

This paper introduces an information-theoretic constraint on learned policy complexity in the Multi-Agent Deep Deterministic Policy Gradient (MADDPG) reinforcement learning algorithm. Previous research with a related approach in continuous control experiments suggests that this method favors learning policies that are more robust to changing environment dynamics. The multi-agent game setting naturally requires this type of robustness, as other agents' policies change throughout learning, introducing a nonstationary environment. For this reason, recent methods in continual learning are compared to our approach, termed Capacity-Limited MADDPG. Results from experimentation in multi-agent cooperative and competitive tasks demonstrate that the capacity-limited approach is a good candidate for improving learning performance in these environments.
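The abstract does not state the exact form of the constraint. As a rough illustration only, one common way to measure policy complexity is the mutual information I(S;A) between states and actions, subtracted from the return with a trade-off weight. The sketch below assumes a small tabular stochastic policy rather than the deterministic continuous actors used in MADDPG. The names `policy_mutual_information`, `regularized_objective`, and `beta` are hypothetical and do not come from the paper.

```python
import math

def policy_mutual_information(policy, state_dist):
    """Mutual information I(S;A) in nats for a tabular policy.

    policy[s][a] is p(a|s); state_dist[s] is p(s). Both are assumed
    to be valid probability distributions (illustrative setup only).
    """
    n_actions = len(policy[0])
    # Marginal action distribution p(a) = sum_s p(s) * p(a|s).
    marginal = [
        sum(p_s * row[a] for p_s, row in zip(state_dist, policy))
        for a in range(n_actions)
    ]
    mi = 0.0
    for p_s, row in zip(state_dist, policy):
        for a, p_a_given_s in enumerate(row):
            joint = p_s * p_a_given_s
            # Zero-probability pairs contribute nothing to the sum.
            if joint > 0.0:
                mi += joint * math.log(p_a_given_s / marginal[a])
    return mi

def regularized_objective(expected_return, policy, state_dist, beta=0.1):
    """Return minus a complexity penalty; beta is an assumed trade-off weight."""
    return expected_return - beta * policy_mutual_information(policy, state_dist)
```

Under this measure, a policy that ignores the state has zero complexity, while a policy that maps each of two equally likely states to a distinct action has complexity log 2 nats. A larger `beta` therefore pushes the learner toward simpler, less state-dependent behavior.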
