Variance Reduced Advantage Estimation with $\delta$ Hindsight Credit Assignment

Nov-19-2019–arXiv.org Artificial Intelligence

Hindsight Credit Assignment (HCA) refers to a recently proposed family of methods for producing more efficient credit assignment in reinforcement learning. These methods work by explicitly estimating the probability that certain actions were taken in the past given present information. Prior work has studied the properties of such methods and demonstrated their behaviour empirically. We extend this work by introducing a particular HCA algorithm which has provably lower variance than the conventional Monte-Carlo estimator when the necessary functions can be estimated exactly. This result provides a strong theoretical basis for how HCA could be broadly useful.
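As a rough illustration of the two quantities the abstract mentions, the sketch below computes (a) a hindsight action probability, i.e. the probability that an action was taken in a state given the successor state that was later observed, and (b) the conventional Monte-Carlo advantage estimate (discounted return minus a baseline). This is not the paper's $\delta$-HCA algorithm. The function names, the use of Bayes' rule with a known transition model, and the toy numbers are all assumptions made for illustration; in practice the hindsight probabilities would typically be learned rather than computed exactly.

```python
def hindsight_action_probs(policy_probs, next_state_likelihoods):
    """Posterior over actions given an observed successor state (hypothetical helper).

    policy_probs[a]           -- pi(a | s), the prior probability of action a
    next_state_likelihoods[a] -- P(s' | s, a) for the observed s' (assumed known)
    Returns P(a | s, s') via Bayes' rule.
    """
    joint = [p * l for p, l in zip(policy_probs, next_state_likelihoods)]
    total = sum(joint)
    if total == 0.0:
        raise ValueError("observed successor state has zero probability under the policy")
    return [j / total for j in joint]


def monte_carlo_advantages(rewards, baselines, gamma):
    """Conventional Monte-Carlo advantage estimate (hypothetical helper).

    For each step t, returns G_t - baselines[t], where G_t is the discounted
    return from step t to the end of the episode.
    """
    advantages = [0.0] * len(rewards)
    ret = 0.0
    # Accumulate the discounted return backwards through the episode.
    for t in reversed(range(len(rewards))):
        ret = rewards[t] + gamma * ret
        advantages[t] = ret - baselines[t]
    return advantages


# Toy usage: two equally likely actions, the first of which is far more
# likely to have produced the observed successor state.
posterior = hindsight_action_probs([0.5, 0.5], [0.9, 0.1])
adv = monte_carlo_advantages([1.0, 0.0, 2.0], [0.0, 0.0, 0.0], gamma=0.5)
```

In the toy usage, the hindsight posterior shifts probability toward the action that better explains the observed outcome, which is the kind of information HCA methods use when assigning credit.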

Country:
- North America > Canada
  - Alberta > Census Division No. 11 > Edmonton Metropolitan Region > Edmonton (0.04)
- Asia > Middle East
  - Jordan (0.04)

Genre:
- Research Report (0.50)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning (0.88)
