Discounted Reinforcement Learning is Not an Optimization Problem
Naik, Abhishek; Shariff, Roshan; Yasui, Niko; Sutton, Richard S.
arXiv.org Artificial Intelligence
Discounted reinforcement learning is fundamentally incompatible with function approximation for control in continuing tasks. This is because it is not an optimization problem: it lacks an objective function. After substantiating these claims, we go on to address some misconceptions about discounting and its connection to the average reward formulation. We encourage researchers to adopt rigorous optimization approaches for reinforcement learning in continuing tasks, such as average reward.
Oct-4-2019