Reinforcement Learning Demystified: Markov Decision Processes (Part 1)
In the previous blog post we talked about reinforcement learning and its characteristics. We mentioned the process of the agent observing the environment output consisting of a reward and the next state, and then acting upon that. This whole process is a Markov Decision Process or an MDP for short. This blog post is a bit mathy. Grab your coffee and a comfortable chair, and just dive in.
Apr-3-2019, 03:44:05 GMT