The Statistical Benefits of Quantile Temporal-Difference Learning for Value Estimation

Rowland, Mark, Tang, Yunhao, Lyle, Clare, Munos, Rémi, Bellemare, Marc G., Dabney, Will

May-28-2023–arXiv.org Artificial Intelligence

We study the problem of temporal-difference-based policy evaluation in reinforcement learning. In particular, we analyse the use of a distributional reinforcement learning algorithm, quantile temporal-difference learning (QTD), for this task.

In this paper, however, we reach a surprising conclusion: even in the tabular setting, there are many scenarios where quantile temporal-difference learning (QTD; Dabney et al., 2018b), a distributional RL algorithm which aims to learn [...]
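For readers unfamiliar with QTD, the following is a minimal sketch of a tabular QTD update used for value estimation. The excerpt above does not state the update rule; the form below is the one commonly given for tabular QTD, and the function names, the dictionary layout of `theta`, and the `terminal` flag are assumptions made for illustration only.

```python
def quantile_levels(m):
    """Midpoint quantile levels tau_i = (2i - 1) / (2m) for i = 1..m."""
    return [(2 * i + 1) / (2 * m) for i in range(m)]


def qtd_update(theta, x, r, x_next, gamma, alpha, terminal=False):
    """Apply one tabular QTD update to theta[x] from the transition (x, r, x_next).

    theta maps each state to a list of m quantile estimates (assumed layout).
    """
    m = len(theta[x])
    taus = quantile_levels(m)
    # Bootstrapped targets: one per quantile estimate at the next state.
    if terminal:
        targets = [r] * m
    else:
        targets = [r + gamma * q for q in theta[x_next]]
    updated = []
    for i in range(m):
        # Fraction of targets lying strictly below the current estimate.
        below = sum(1 for t in targets if t < theta[x][i]) / m
        # Move the estimate up by tau_i and down by the fraction below it.
        updated.append(theta[x][i] + alpha * (taus[i] - below))
    theta[x] = updated


def value_estimate(theta, x):
    """Estimate the mean return at x as the average of its quantile estimates."""
    return sum(theta[x]) / len(theta[x])
```

Each increment is bounded by the step size `alpha` regardless of how large the reward is, in contrast to a classical TD update, whose increment scales with the TD error. Averaging the learned quantile estimates, as in `value_estimate`, is one way to turn them into a value prediction.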

artificial intelligence, machine learning, reinforcement learning

Country:
- North America > United States
  - Massachusetts > Hampshire County
    - Amherst (0.04)
  - Hawaii > Honolulu County
    - Honolulu (0.04)
- Europe
  - United Kingdom > England
    - Cambridgeshire > Cambridge (0.14)
  - Germany > Bavaria
    - Upper Bavaria > Munich (0.04)
  - Czechia > South Moravian Region
    - Brno (0.04)
- Asia > Middle East
  - Jordan (0.04)

Genre:
- Research Report (0.81)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Reinforcement Learning (1.00)
  - Neural Networks > Deep Learning (0.46)
