Reinforcement Learning w/ Keras OpenAI: DQNs – Towards Data Science
Q-learning (which doesn't stand for anything, by the way) is centered around creating a "virtual table" that records how much reward to expect from each possible action given the current state of the environment. Let's break that down one step at a time. What do we mean by "virtual table"? Imagine that for each possible configuration of the input space, you have a table that assigns a score to each of the possible actions you can take. If this were magically possible, then it would be extremely easy for you to "beat" the environment: simply choose the action that has the highest score! Two points to note about this score.
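To make the "virtual table" idea concrete, here is a minimal sketch in plain Python. The state names, action names, and scores are all made up for illustration and do not come from any real environment; the only point is the lookup rule described above, which picks the highest-scoring action for the current state.

```python
# Hypothetical toy Q-table: one row of scores per state, one score per action.
# State names, action names, and score values are assumptions for illustration.
ACTIONS = ["left", "right"]

q_table = {
    "s0": [0.1, 0.7],  # in state s0, "right" has the higher score
    "s1": [0.4, 0.2],  # in state s1, "left" has the higher score
}


def best_action(state):
    """Return the action with the highest score for the given state."""
    scores = q_table[state]
    # Index of the largest score in this state's row.
    best_index = max(range(len(scores)), key=lambda i: scores[i])
    return ACTIONS[best_index]


if __name__ == "__main__":
    for state in q_table:
        print(state, "->", best_action(state))
```

A table like this only works when the states can be enumerated. When the input space is too large to list, a DQN swaps the table for a neural network that estimates the same per-action scores.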
May-2-2018, 17:26:23 GMT