Simplified: Off-Policy vs On-Policy in Reinforcement Learning
Early on when learning reinforcement learning, you will run into a distinction between algorithms: some are on-policy, some are off-policy. You may read many explanations and still ask: what the hell is the difference? Let's try to clarify this concept once and for all. I believe the best way to do this is by example, so let's set up a simple environment.
Sep-12-2021, 11:55:30 GMT