Understanding the Pathologies of Approximate Policy Evaluation when Combined with Greedification in Reinforcement Learning
