Dissecting Reinforcement Learning-Part.3

Jan-7-2018, 01:02:07 GMT–#artificialintelligence

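The update rule used for prediction is based on the tuple State-Reward-State. Remember that now we are in the control case, where we use the Q-function (see second post) to estimate the best policy. The Q-function requires a state-action pair as input, so the update has to carry actions as well: the tuple becomes State-Action-Reward-State-Action, which is where the name SARSA comes from.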
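
As a rough illustration of the difference, the sketch below contrasts a prediction update driven by State-Reward-State with a control update driven by state-action pairs (the standard SARSA form). It is not taken from the original post: the function names `td0_update` and `sarsa_update`, the step size `alpha`, the discount factor `gamma`, and the toy states and actions are all assumptions chosen for the example.

```python
from collections import defaultdict


def td0_update(V, s, r, s_next, alpha=0.1, gamma=0.9):
    """Prediction: update the utility of state s from (state, reward, next state)."""
    # Move V[s] toward the one-step target r + gamma * V[s_next].
    V[s] += alpha * (r + gamma * V[s_next] - V[s])
    return V


def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """Control: update Q for the pair (s, a) from (state, action, reward, next state, next action)."""
    # Same idea as above, but the table is indexed by state-action pairs.
    Q[(s, a)] += alpha * (r + gamma * Q[(s_next, a_next)] - Q[(s, a)])
    return Q


# Hypothetical toy values, only to show the calls.
V = defaultdict(float)
V[1] = 1.0
td0_update(V, s=0, r=1.0, s_next=1, alpha=0.5, gamma=1.0)

Q = defaultdict(float)
Q[(1, "right")] = 2.0
sarsa_update(Q, s=0, a="up", r=1.0, s_next=1, a_next="right", alpha=0.5, gamma=1.0)

print(V[0], Q[(0, "up")])
```

The only structural change between the two functions is the key of the table: a state for the utility estimate, a state-action pair for the Q-function. That is why the control case needs the action taken in the next state before it can update.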
