Making Sense of the Bias / Variance Trade-off in (Deep) Reinforcement Learning
Since the launch of the ML-Agents platform a few months ago, I have been surprised and delighted to find that thanks to it and other tools like OpenAI Gym, a new, wider audience of individuals are building Reinforcement Learning (RL) environments, and using them to train state-of-the-art models. The ability to work with these algorithms, previously something reserved for ML PhDs, is opening up to a wider world. As a result, I have had the unique opportunity to not just write about applying RL to existing problems, but also to help developers and researchers debug their models in a more active way. In doing so, I often get questions which come down to a matter of understanding the unique hyperparameters and learning process around the RL paradigm. In this article, I want to attempt to highlight one of these conceptual pieces: bias and variance in RL, and attempt to demystify it to some extent.
Feb-3-2018, 15:56:16 GMT
- Technology: