Q-Learning for Mean-Field Controls

Gu, Haotian; Guo, Xin; Wei, Xiaoli; Xu, Renyuan

Feb-10-2020, arXiv.org Machine Learning

Multi-agent reinforcement learning (MARL) has been applied to many challenging problems, including two-team computer games, autonomous driving, and real-time bidding. Despite this empirical success, there is a conspicuous absence of theoretical study of MARL algorithms, mainly because of the curse of dimensionality: the joint state-action space grows exponentially with the number of agents. Mean-field controls (MFC) with infinitely many agents and deterministic flows, meanwhile, provide good approximations to $N$-agent collaborative games in terms of both game values and optimal strategies. In this paper, we study collaborative MARL under an MFC approximation framework: we develop a model-free kernel-based Q-learning algorithm (CDD-Q) and show that its convergence rate and sample complexity are independent of the number of agents. Our empirical studies on MFC examples demonstrate the strong performance of CDD-Q. Moreover, the CDD-Q algorithm can be applied to a general class of Markov decision problems (MDPs) with deterministic dynamics and continuous state-action spaces.
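To make the phrase "kernel-based Q-learning with deterministic dynamics and continuous state-action space" concrete, here is a minimal sketch of the general idea. It is not the CDD-Q algorithm from the paper, whose details the abstract does not give. It assumes a hypothetical one-dimensional toy problem (`step`), a small finite net of state-action centers, a triangular kernel, and illustrative values for the discount factor and bandwidth; all names and parameters are assumptions made for this example.

```python
import math

# All constants below are illustrative assumptions, not values from the paper.
GAMMA = 0.9        # discount factor
BANDWIDTH = 0.3    # kernel bandwidth
STATES = [0.0, 0.25, 0.5, 0.75, 1.0]
ACTIONS = [-0.5, -0.25, 0.0, 0.25, 0.5]
CENTERS = [(s, a) for s in STATES for a in ACTIONS]  # finite net over [0,1] x [-0.5,0.5]


def step(s, a):
    """Hypothetical deterministic toy environment: returns (reward, next_state)."""
    reward = -(s - 0.5) ** 2 - 0.1 * a ** 2
    next_state = min(1.0, max(0.0, s + a))
    return reward, next_state


def kernel_weights(point):
    """Normalized triangular-kernel weights of `point` over CENTERS."""
    raw = [max(0.0, 1.0 - math.dist(point, c) / BANDWIDTH) for c in CENTERS]
    total = sum(raw)
    if total == 0.0:
        # No center within BANDWIDTH: fall back to the nearest center.
        nearest = min(range(len(CENTERS)), key=lambda i: math.dist(point, CENTERS[i]))
        raw = [1.0 if i == nearest else 0.0 for i in range(len(CENTERS))]
        total = 1.0
    return [w / total for w in raw]


def q_hat(q, s, a):
    """Kernel-regression estimate of Q at an arbitrary continuous (s, a)."""
    return sum(w * v for w, v in zip(kernel_weights((s, a)), q))


def greedy_action(q, s):
    """Best action on the action grid under the kernel estimate."""
    return max(ACTIONS, key=lambda a: q_hat(q, s, a))


# One observed transition per center; the learner only sees (reward, next_state).
transitions = [step(s, a) for s, a in CENTERS]
# Kernel weights at (next_state, a') are fixed because the dynamics are deterministic.
next_weights = [
    [kernel_weights((s_next, a2)) for a2 in ACTIONS] for _, s_next in transitions
]

q = [0.0] * len(CENTERS)  # Q-values stored at the centers only
for _ in range(200):
    # Synchronous Bellman update on the net, smoothed by the kernel operator.
    q = [
        r + GAMMA * max(sum(w * v for w, v in zip(ws, q)) for ws in next_weights[i])
        for i, (r, _) in enumerate(transitions)
    ]
```

Because the kernel weights are nonnegative and sum to one, the smoothed Bellman update in this sketch remains a contraction with factor `GAMMA`, so the sweep converges. The resulting greedy policy, for instance `greedy_action(q, 0.0)`, should push the state toward 0.5, where the toy reward is highest. The smoothing does bias the learned values; a finer net with a smaller bandwidth reduces that bias at the cost of more stored centers.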



Country:
- North America
  - United States > California
    - Los Angeles County > Los Angeles (0.14)
    - Santa Clara County > Stanford (0.04)
    - Alameda County > Berkeley (0.04)
  - Canada > Ontario
    - Toronto (0.04)
- Europe > United Kingdom
  - England > Oxfordshire > Oxford (0.14)

Genre:
- Research Report (0.40)

Industry:
- Information Technology (0.87)
- Leisure & Entertainment > Games (0.87)
- Transportation > Ground
  - Road (0.87)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning > Reinforcement Learning (1.00)
  - Representation & Reasoning > Agents
    - Agent Societies (0.68)
