r/MachineLearning - [R] Discounted Reinforcement Learning Is Not an Optimization Problem

Dec-18-2019, 11:54:09 GMT – #artificialintelligence

If one policy has a value greater than or equal to another's in every state, we might say that policy is better. The policy gradient paper guarantees that locally optimal policies can be found with function approximation. This functional returns either the long-term average reward or the discounted cumulative reward from a designated start state. In practice, one would obtain an unbiased estimator for the gradient of this functional w.r.t.
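To make the state-wise comparison concrete, here is a minimal sketch in plain Python. Everything in it is a hypothetical toy rather than anything from the post or the paper: a two-state chain in which both policies stay put, the reward vectors, the discount of 0.9, and the helper names `policy_value`, `dominates`, and `start_state_objective`. It evaluates two policies, checks whether either one has greater or equal value in every state, and then scores both from a designated start state.

```python
def policy_value(P, r, gamma=0.9, iters=1000):
    """Iterative policy evaluation: repeatedly apply v <- r + gamma * P v."""
    n = len(r)
    v = [0.0] * n
    for _ in range(iters):
        v = [r[s] + gamma * sum(P[s][t] * v[t] for t in range(n))
             for s in range(n)]
    return v


def dominates(v_a, v_b, tol=1e-9):
    """True if v_a is greater than or equal to v_b in every state."""
    return all(a >= b - tol for a, b in zip(v_a, v_b))


def start_state_objective(v, start_state=0):
    """Scalar objective: discounted return from one designated start state."""
    return v[start_state]


# Hypothetical toy problem: two states, and both policies stay where they are.
P_stay = [[1.0, 0.0],
          [0.0, 1.0]]

# Policy A earns reward only in state 0, policy B only in state 1 (assumed).
v_a = policy_value(P_stay, [1.0, 0.0])
v_b = policy_value(P_stay, [0.0, 1.0])

a_better = dominates(v_a, v_b)   # False: A is worse in state 1
b_better = dominates(v_b, v_a)   # False: B is worse in state 0

# A start-state objective still ranks the two policies with a single number.
j_a = start_state_objective(v_a)
j_b = start_state_objective(v_b)
```

In this toy case neither policy is better in the state-wise sense, yet the start-state objective prefers policy A. If a restricted policy class contained only these two policies, the ranking would depend entirely on which start state is designated.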

machinelearning, optimization problem, reinforcement learning


News Web Page


Industry:
- Media > News (0.40)

Technology:
- Information Technology
  - Communications > Social Media (0.76)
  - Artificial Intelligence
    - Representation & Reasoning > Optimization (0.40)
    - Machine Learning > Reinforcement Learning (0.40)