On the Estimation Bias in Double Q-Learning

Neural Information Processing Systems 

One phenomenon of interest is that Q-learning (Watkins, 1989) is known to suffer from overestimation, since it applies a maximum operator over a set of estimated action-values.
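
The effect can be illustrated with a small simulation. The sketch below is not taken from the paper: it assumes a single state with five actions whose true values are all zero, and it models estimation error as independent zero-mean Gaussian noise. The function names (`max_estimator`, `double_estimator`, `mean_bias`) and all parameter values are hypothetical choices made for illustration. It compares the maximum over one set of noisy estimates with a double estimator, which selects the action using one set of estimates and evaluates it using an independent second set.

```python
import random
import statistics


def max_estimator(true_values, noise_std, rng):
    """Take the max over one set of noisy estimates, as the Q-learning target does."""
    estimates = [q + rng.gauss(0.0, noise_std) for q in true_values]
    return max(estimates)


def double_estimator(true_values, noise_std, rng):
    """Select the action with one noisy set, then evaluate it with an independent set."""
    est_a = [q + rng.gauss(0.0, noise_std) for q in true_values]
    est_b = [q + rng.gauss(0.0, noise_std) for q in true_values]
    best = max(range(len(true_values)), key=lambda i: est_a[i])
    return est_b[best]


def mean_bias(estimator, true_values, noise_std, trials, seed):
    """Average the estimator over many trials and subtract the true maximum value."""
    rng = random.Random(seed)
    samples = [estimator(true_values, noise_std, rng) for _ in range(trials)]
    return statistics.fmean(samples) - max(true_values)


# Assumed toy setting: five actions, all with true value 0, unit-variance noise.
TRUE_VALUES = [0.0] * 5
single_bias = mean_bias(max_estimator, TRUE_VALUES, 1.0, 20000, seed=0)
double_bias = mean_bias(double_estimator, TRUE_VALUES, 1.0, 20000, seed=0)

print(f"single estimator bias: {single_bias:+.3f}")
print(f"double estimator bias: {double_bias:+.3f}")
```

In this toy setting the single estimator's bias is clearly positive even though every individual estimate is unbiased, because the maximum picks out whichever estimate happens to have the largest positive error. The double estimator's bias is close to zero here only because all true values are equal; when the true values differ, the double estimator can underestimate the maximum instead.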
