The Value Function Polytope in Reinforcement Learning
