Reviews: Shallow Updates for Deep Reinforcement Learning

Neural Information Processing Systems 

The authors propose to augment value-based methods for deep reinforcement learning (DRL) with batch methods for linear function approximation (SRL). The idea is motivated by interpreting the output of the second-to-last layer of a neural network as linear features. To make this combination work, the authors argue that regularization is needed. Experimental results are provided for 5 Atari games on combinations of DQN/Double DQN and LSTD-Q/FQI.

Strengths: I find the proposal to combine DRL and SRL with Bayesian regularization original and promising.
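To make the summarized idea concrete, here is a minimal sketch of how I read the regularized last-layer refit. It is not the authors' implementation. It assumes the second-to-last layer activations are collected in a matrix `phi`, that regression targets (for example FQI-style bootstrapped targets) are already computed, and that the regularizer pulls the solution toward the network's current last-layer weights `w_prior`. The function name `refit_last_layer` and the parameter `lam` are hypothetical.

```python
import numpy as np

def refit_last_layer(phi, targets, w_prior, lam):
    """Regularized least squares for the last linear layer.

    Solves  min_w ||phi @ w - targets||^2 + lam * ||w - w_prior||^2,
    i.e. a Gaussian prior centered on the current weights (assumed form).
    """
    d = phi.shape[1]
    A = phi.T @ phi + lam * np.eye(d)      # regularized Gram matrix
    b = phi.T @ targets + lam * w_prior    # data term plus prior pull
    return np.linalg.solve(A, b)

# Toy usage with synthetic features standing in for network activations.
rng = np.random.default_rng(0)
phi = rng.normal(size=(200, 4))            # hypothetical last-hidden-layer features
w_current = np.array([1.0, -2.0, 0.5, 3.0])
targets = phi @ w_current                  # placeholder for bootstrapped targets
w_new = refit_last_layer(phi, targets, w_prior=w_current, lam=1.0)
```

With a large `lam` the solution stays close to the network's weights, and with `lam = 0` it reduces to ordinary least squares, which is the trade-off the regularization argument appears to rest on.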