How to fix reinforcement learning

Apr-20-2020, 00:27:58 GMT–#artificialintelligence

"Value functions are a core component of [RL] systems. The main idea is to to construct a single function approximator V(s; θ) that estimates the long-term reward from any state s, using parameters θ. In this paper we introduce universal value function approximators (UVFAs) V(s, g; θ) that generalise not just over states s but also over goals g." Here is a rigorous, mathematical formulation of RL that treats goals (the high-level objective of the skill to be learned, which should yield good rewards) as a fundamental and necessary input rather than something to be discovered from just the reward signal. The agent is told what it's supposed to do, just as is done in zero-shot learning and actual human learning. It has been 3 years since this was published, and how many papers have cited it since?

agent, learning, reinforcement, (15 more...)

#artificialintelligence

Apr-20-2020, 00:27:58 GMT

News Web Page

Add feedback

Genre:
- Research Report (0.69)

Industry:
- Education (0.94)
- Leisure & Entertainment > Games (0.72)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning (1.00)
  - Machine Learning
    - Reinforcement Learning (1.00)
    - Neural Networks > Deep Learning (0.31)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found