Discounted Reinforcement Learning is Not an Optimization Problem

Naik, Abhishek, Shariff, Roshan, Yasui, Niko, Sutton, Richard S.

Oct-4-2019, arXiv.org Artificial Intelligence

Discounted reinforcement learning is fundamentally incompatible with function approximation for control in continuing tasks. This is because it is not an optimization problem: it lacks an objective function. After substantiating these claims, we go on to address some misconceptions about discounting and its connection to the average reward formulation. We encourage researchers to adopt rigorous optimization approaches for reinforcement learning in continuing tasks, such as average reward.

average reward, discount factor, representable policy

Country:
- North America
  - Canada > Alberta (0.14)
  - United States (0.04)

Genre:
- Research Report (0.40)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning > Optimization (1.00)
  - Machine Learning > Reinforcement Learning (0.89)
