Optimality of Reinforcement Learning Algorithms with Linear Function Approximation
–Neural Information Processing Systems
There are several reinforcement learning algorithms that yield approximate solutions for the problem of policy evaluation when the value function is represented with a linear function approximator. In this paper we show that each of the solutions is optimal with respect to a specific objective function. The results presented here will be useful for comparing the algorithms in terms of the error they achieve relative to the error of the optimal approximate solution.
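As a rough illustration of the idea that different algorithms reach different approximate solutions, the sketch below evaluates a fixed policy on a toy two-state Markov chain with a single scalar feature. Everything in it is an assumption made for illustration and is not taken from the paper: the chain, the rewards, the feature values, and all names (`PHI`, `theta_ls`, `theta_br`, `theta_td`). The three solutions shown are the standard least-squares fit to the true values, the Bellman-residual minimizer, and the TD fixed point, which may not match the exact set of algorithms or objective functions the paper analyzes.

```python
# Toy 2-state Markov chain under a fixed policy.
# All numbers below are illustrative assumptions, not values from the paper.
GAMMA = 0.9
P = [[0.5, 0.5], [0.5, 0.5]]   # transition probabilities P[s][t]
R = [1.0, 0.0]                 # expected reward when leaving state s
D = [0.5, 0.5]                 # stationary distribution of P (state weights)
PHI = [1.2, 1.0]               # one scalar feature per state: V_hat(s) = PHI[s] * theta
STATES = range(2)


def true_values(sweeps=2000):
    """Iterate the Bellman equation V = R + GAMMA * P V until it has converged."""
    v = [0.0, 0.0]
    for _ in range(sweeps):
        v = [R[s] + GAMMA * sum(P[s][t] * v[t] for t in STATES) for s in STATES]
    return v


V = true_values()

# U[s] = PHI[s] - GAMMA * E[PHI[s'] | s]; it appears in both Bellman-based objectives.
NEXT_PHI = [sum(P[s][t] * PHI[t] for t in STATES) for s in STATES]
U = [PHI[s] - GAMMA * NEXT_PHI[s] for s in STATES]


def value_error(theta):
    """Weighted squared error between the linear estimate and the true values."""
    return sum(D[s] * (PHI[s] * theta - V[s]) ** 2 for s in STATES)


def bellman_residual(theta):
    """Weighted squared Bellman residual of the linear estimate."""
    return sum(D[s] * (U[s] * theta - R[s]) ** 2 for s in STATES)


def td_expected_update(theta):
    """Expected TD(0) update direction; it is zero at the TD fixed point."""
    return sum(D[s] * PHI[s] * (R[s] - U[s] * theta) for s in STATES)


# Closed-form solutions for the scalar-parameter case.
theta_ls = sum(D[s] * PHI[s] * V[s] for s in STATES) / sum(D[s] * PHI[s] ** 2 for s in STATES)
theta_br = sum(D[s] * U[s] * R[s] for s in STATES) / sum(D[s] * U[s] ** 2 for s in STATES)
theta_td = sum(D[s] * PHI[s] * R[s] for s in STATES) / sum(D[s] * PHI[s] * U[s] for s in STATES)

for name, theta in [("least squares", theta_ls), ("Bellman residual", theta_br), ("TD", theta_td)]:
    print(f"{name:>16}: theta={theta:.4f}  "
          f"value_error={value_error(theta):.5f}  "
          f"bellman_residual={bellman_residual(theta):.5f}")
```

In this sketch the three parameters differ, and each one is best under its own criterion: `theta_ls` has the smallest `value_error`, `theta_br` has the smallest `bellman_residual`, and `theta_td` is the only one at which `td_expected_update` vanishes. Comparing `value_error` across the three gives a small-scale version of comparing each algorithm's error with that of the optimal approximate solution.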
Apr-6-2023, 16:18:27 GMT