Understanding Reinforcement Learning Hands-On: The Bellman Equation pt.1

Welcome to the fifth entry in a series on Reinforcement Learning. In the previous article, we presented the MDP Framework for describing complex environments. This allowed us to build a richer and more varied version of the basic Multi-Armed Bandits problem, which we called the Casinos Environment. We then implemented this scenario using OpenAI's gym and wrote a simple agent that acts randomly to showcase how an interaction plays out under the MDP Framework. Today, we turn our focus back to the agents and show a way to describe an agent's behavior in complex scenarios, where past actions determine future rewards.
