Rates of Convergence of Performance Gradient Estimates Using Function Approximation and Bias in Reinforcement Learning

Grudic, Gregory Z., Ungar, Lyle H.

Neural Information Processing Systems 

We address two open theoretical questions in Policy Gradient Reinforcement Learning. The first concerns the efficacy of using function approximation to represent the state-action value function.
