Optimality of Reinforcement Learning Algorithms with Linear Function Approximation
Neural Information Processing Systems
Several reinforcement learning algorithms yield approximate solutions for the problem of policy evaluation when the value function is represented with a linear function approximator. In this paper we show that each of these solutions is optimal with respect to a specific objective function.
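To make the idea of several distinct approximate solutions concrete, here is a minimal sketch, not taken from the paper. The abstract does not name the algorithms or objective functions it covers, so the sketch assumes two standard ones: the TD(0) fixed point and the Bellman residual minimizer. The two-state chain, rewards, single feature, discount factor, and state weights are all hypothetical values chosen for illustration.

```python
# Hypothetical two-state Markov chain under a fixed policy (illustrative values).
P = [[0.9, 0.1],
     [0.2, 0.8]]          # transition probabilities
r = [0.0, 1.0]            # expected reward in each state
gamma = 0.9               # discount factor
phi = [1.0, 2.0]          # one feature per state, so V(s) is approximated by theta * phi[s]
d = [2.0 / 3.0, 1.0 / 3.0]  # state weights: stationary distribution of P


def weighted_dot(x, y, w):
    """Weighted inner product sum_s w[s] * x[s] * y[s]."""
    return sum(ws * xs * ys for ws, xs, ys in zip(w, x, y))


def expected_next_feature(P, phi):
    """(P phi)[s]: expected feature value of the successor state."""
    return [sum(p * f for p, f in zip(row, phi)) for row in P]


def feature_difference(P, phi, gamma):
    """psi[s] = phi[s] - gamma * (P phi)[s]."""
    return [f - gamma * nf for f, nf in zip(phi, expected_next_feature(P, phi))]


def td_fixed_point(P, r, phi, gamma, d):
    """Solve phi^T D (r - psi * theta) = 0: Bellman error orthogonal to the feature."""
    psi = feature_difference(P, phi, gamma)
    return weighted_dot(phi, r, d) / weighted_dot(phi, psi, d)


def bellman_residual_minimizer(P, r, phi, gamma, d):
    """Minimize the weighted squared Bellman residual ||r - psi * theta||_D^2."""
    psi = feature_difference(P, phi, gamma)
    return weighted_dot(psi, r, d) / weighted_dot(psi, psi, d)


def bellman_residual(theta, P, r, phi, gamma, d):
    """Weighted squared Bellman residual for a given theta."""
    psi = feature_difference(P, phi, gamma)
    err = [rs - theta * ps for rs, ps in zip(r, psi)]
    return weighted_dot(err, err, d)


theta_td = td_fixed_point(P, r, phi, gamma, d)
theta_br = bellman_residual_minimizer(P, r, phi, gamma, d)
print("TD fixed point theta:        ", theta_td)
print("Bellman residual theta:      ", theta_br)
print("Bellman residual at TD theta:", bellman_residual(theta_td, P, r, phi, gamma, d))
print("Bellman residual at BR theta:", bellman_residual(theta_br, P, r, phi, gamma, d))
```

In this sketch the two parameters differ. The second satisfies a least-squares criterion by construction, while the first satisfies an orthogonality condition instead. This only illustrates how different solutions can each meet their own criterion; it does not reproduce the paper's analysis.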
Dec-31-2003