A Natural Policy Gradient

Kakade, Sham M.

Neural Information Processing Systems 

These greedy optimal actions are those that would be chosen under one improvement step of policy iteration with approximate, compatible value functions, as defined by Sutton et al.
