Quantile Reinforcement Learning
I've been exploring reinforcement learning methods that take advantage of uncertainty. In particular, I have implemented a basic version of QR-DQN-1 from the paper "Distributional Reinforcement Learning with Quantile Regression". Doing so required filling in some practical details from the paper, which I'm going to explain here. The approach is an extension of deep Q-learning, which learns the value of taking each action in a given state and then picks the action that maximizes this value (for more background, see this post). Here, rather than treating that value as a single number, we think of it as a random variable drawn from some unknown distribution.
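To make the idea of learning a value distribution a little more concrete, here is a minimal NumPy sketch of the quantile Huber loss that QR-DQN minimizes, with kappa = 1 corresponding to the "-1" in QR-DQN-1. It is not taken from my implementation: the function name `quantile_huber_loss` and its arguments are made up for illustration, and it handles a single state-action pair rather than a batch.

```python
import numpy as np


def quantile_huber_loss(pred_quantiles, target_samples, kappa=1.0):
    """Quantile Huber loss between N predicted quantiles and M target samples.

    Hypothetical helper for illustration; assumes 1-D inputs and kappa > 0.
    """
    pred = np.asarray(pred_quantiles, dtype=float)
    target = np.asarray(target_samples, dtype=float)
    n = pred.shape[0]

    # Quantile midpoints tau_i = (2i + 1) / (2N), one per predicted quantile.
    taus = (np.arange(n) + 0.5) / n

    # Pairwise errors u[i, j] = target[j] - pred[i].
    u = target[None, :] - pred[:, None]
    abs_u = np.abs(u)

    # Huber loss: quadratic within kappa of zero, linear outside.
    huber = np.where(abs_u <= kappa, 0.5 * u ** 2, kappa * (abs_u - 0.5 * kappa))

    # Asymmetric weight |tau - 1{u < 0}|: over- and under-estimates are
    # penalized differently depending on which quantile is being fit.
    weight = np.abs(taus[:, None] - (u < 0).astype(float))

    # Average over target samples, sum over predicted quantiles.
    return float((weight * huber).mean(axis=1).sum())


# The action value used for greedy selection is the mean of the quantiles.
example_quantiles = np.array([-1.0, 0.0, 2.0])
example_q_value = example_quantiles.mean()
```

In a full agent the target samples would come from the Bellman update applied to the target network's quantiles, and the loss would be averaged over a minibatch; both are left out here to keep the sketch short.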
Jul-31-2020, 22:30:18 GMT