Finite-Sample Analysis of Decentralized Temporal-Difference Learning with Linear Function Approximation

Sun, Jun, Wang, Gang, Giannakis, Georgios B., Yang, Qinmin, Yang, Zaiyue

Nov-3-2019–arXiv.org Machine Learning

J. Sun and Q. Yang are with the College of Control Science and Engineering, and the State Key Laboratory of Industrial Control Technology, Zhejiang University, Hangzhou, China. G. Wang and G. B. Giannakis are with the Digital Technology Center and the Department of Electrical and Computer Engineering, University of Minnesota, Minneapolis, MN 55455, USA. Z. Yang is with the Department of Mechanical and Energy Engineering, Southern University of Science and Technology, Shenzhen, China.

Thanks to its generality, reinforcement learning (RL) has been widely studied in many areas, such as control theory, game theory, operations research, multi-agent systems, machine learning, artificial intelligence, and statistics [23]. In recent years, combined with deep learning, RL has demonstrated great potential in addressing challenging practical control and optimization problems [17, 21]. Among RL algorithms, temporal-difference (TD) learning has arguably become one of the most popular, with the celebrated TD(0) algorithm as its dominant variant [22]. TD learning iteratively updates an estimate of the so-termed value function v_π(s) of a given policy π from temporally successive samples. For a finite state space, the classical version of the TD(0) algorithm adopts a tabular representation of v_π(s), storing one value estimate per state.
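The tabular TD(0) update described above can be illustrated with a minimal sketch. This is not the authors' decentralized algorithm: it is the standard single-agent tabular update, and the function name `td0_tabular`, the transition tuple layout `(s, r, s_next, done)`, and the default step size `alpha` and discount factor `gamma` are assumptions introduced here for illustration.

```python
def td0_tabular(transitions, num_states, alpha=0.1, gamma=0.9):
    """Estimate v_pi(s) with tabular TD(0).

    transitions: iterable of (s, r, s_next, done) tuples sampled while
        following the fixed policy pi (hypothetical layout, assumed here).
    num_states: size of the finite state space.
    alpha: constant step size (assumption; schedules are also common).
    gamma: discount factor in [0, 1).
    """
    v = [0.0] * num_states  # one value estimate per state
    for s, r, s_next, done in transitions:
        # Bootstrapped target: reward plus discounted estimate of the next state.
        # A terminal transition contributes no future value.
        target = r + (0.0 if done else gamma * v[s_next])
        # Move the estimate for s a fraction alpha toward the target.
        v[s] += alpha * (target - v[s])
    return v


# Toy example: a deterministic two-state cycle 0 -> 1 -> 0 -> ...
# with reward 1 when leaving state 0 and reward 0 when leaving state 1.
cycle = [(0, 1.0, 1, False), (1, 0.0, 0, False)] * 2000
values = td0_tabular(cycle, num_states=2, alpha=0.1, gamma=0.5)
print(values)
```

For this toy chain the Bellman equations v(0) = 1 + γ v(1) and v(1) = γ v(0) give v(0) = 1/(1 − γ²), so with γ = 0.5 the estimates should approach 4/3 and 2/3.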
