Reviews: Shallow Updates for Deep Reinforcement Learning
Neural Information Processing Systems
The authors propose augmenting value-based deep reinforcement learning (DRL) methods with batch methods for linear function approximation, referred to as shallow reinforcement learning (SRL). The idea is motivated by interpreting the output of the second-to-last layer of a neural network as linear features. The authors argue that regularization is needed to make this combination work. Experimental results are provided for five Atari games, using combinations of DQN/Double DQN with LSTD-Q/FQI. Strengths: I find the proposal to combine DRL and SRL with Bayesian regularization original and promising.
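To make the summarized idea concrete, here is a minimal sketch of a single regularized fitted-Q-iteration (FQI) solve over last-hidden-layer features. It is an illustration only, not the authors' implementation: the function name `regularized_fqi_update`, the array shapes, and the choice to shrink the solution toward the network's current last-layer weights `w_prior` with strength `lam` are all assumptions made for this example.

```python
import numpy as np

def regularized_fqi_update(phi, rewards, phi_next_all, dones, w_prior,
                           gamma=0.99, lam=1.0):
    """One least-squares solve for last-layer weights, shrunk toward w_prior.

    phi:          (N, d) features of the (state, action) pairs in the batch
    rewards:      (N,)   observed rewards
    phi_next_all: (N, A, d) features of every action in each next state
    dones:        (N,)   1.0 where the episode terminated, else 0.0
    w_prior:      (d,)   current last-layer weights (hypothetical prior mean)
    """
    # Bootstrapped targets computed with the prior weights:
    # y = r + gamma * max_a' phi(s', a')^T w_prior (zero bootstrap at terminals)
    q_next = phi_next_all @ w_prior                      # shape (N, A)
    targets = rewards + gamma * (1.0 - dones) * q_next.max(axis=1)

    # Ridge-style solve: minimize ||phi w - y||^2 + lam * ||w - w_prior||^2
    d = phi.shape[1]
    A = phi.T @ phi + lam * np.eye(d)
    b = phi.T @ targets + lam * w_prior
    return np.linalg.solve(A, b)
```

In this sketch, a large `lam` keeps the solution close to the network's existing weights, while a small `lam` lets the batch solution dominate. That trade-off is one way to read the review's remark that regularization is needed for the combination to work.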
Oct-7-2024, 18:27:01 GMT