A Simple Intro to Q-Learning in R: Floor Plan Navigation
The question to answer here is: what's the best way to get from Room 2 to Room 5 (outside)? Answering it with reinforcement learning also tells us the optimal route from any room to outside. And if we run the iterative algorithm again with a new target state, we get the optimal route from any room to that new target. Because Q-Learning is model-free, we don't need to know the transition probabilities, that is, how likely our agent is to move from any room to any other room. If you had observed the behavior of this system over time, you might be able to estimate them, but in many cases that information just isn't available.
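To make the model-free point concrete, here is a minimal sketch in R. The six-room layout (rooms 0 to 5, with room 5 as outside), the reward values, the discount factor `gamma = 0.8`, the episode count, and the helper names `q_learn` and `greedy_path` are all assumptions made for illustration; none of them come from the text above. The agent only samples moves through doors and updates its Q table, and no transition probabilities appear anywhere.

```r
set.seed(1)

# Assumed floor plan (hypothetical): rooms 0..5, room 5 is "outside".
# R[s, a] is the reward for moving from room s-1 to room a-1:
#   -1 = no door, 0 = door, 100 = door that leads to the target.
R <- matrix(c(
  -1, -1, -1, -1,  0,  -1,
  -1, -1, -1,  0, -1, 100,
  -1, -1, -1,  0, -1,  -1,
  -1,  0,  0, -1,  0,  -1,
   0, -1, -1,  0, -1, 100,
  -1,  0, -1, -1,  0, 100
), nrow = 6, byrow = TRUE)

# Tabular Q-learning with a learning rate of 1 (an assumed simplification).
# `target` is a 1-based matrix index, so room 5 is index 6.
q_learn <- function(R, target, gamma = 0.8, episodes = 1000) {
  n <- nrow(R)
  Q <- matrix(0, n, n)
  for (ep in seq_len(episodes)) {
    s <- sample.int(n, 1)                      # random starting room
    repeat {
      actions <- which(R[s, ] >= 0)            # doors available from s
      a <- actions[sample.int(length(actions), 1)]
      Q[s, a] <- R[s, a] + gamma * max(Q[a, ]) # Q-learning update
      if (a == target) break                   # episode ends at the target
      s <- a
    }
  }
  Q
}

# Follow the highest-valued action from `start` until `target` is reached.
greedy_path <- function(Q, start, target, max_steps = 20) {
  path <- start
  s <- start
  while (s != target && length(path) <= max_steps) {
    s <- which.max(Q[s, ])
    path <- c(path, s)
  }
  path - 1  # convert 1-based indices back to 0-based room labels
}

Q <- q_learn(R, target = 6)
path <- greedy_path(Q, start = 3, target = 6)  # Room 2 is index 3
print(path)
```

Under this assumed layout, the printed route should start at room 2, pass through room 3, and reach room 5 in three moves. To route to a different target, the reward matrix would need its 100 entries moved to the doors that lead into the new target room before rerunning `q_learn`.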
Apr-30-2018, 22:01:13 GMT