Reviews: Sample-Efficient Deep Reinforcement Learning via Episodic Backward Update

Neural Information Processing Systems 

All reviewers recommend accepting the paper. The authors' response addressed most of the reviewers' concerns. While the AC recommends accepting the paper, the AC encourages the authors to consider the comments of Reviewer 1. Changing only the backup mechanism while keeping all other hyperparameters fixed, as in the Nature DQN model, is indeed a sound experimental setup. However, the optimal operating regime for different models might differ (even when they share architectures and training protocols): for instance, we could 'afford' a larger learning rate with a better backup mechanism.
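To make the backup-mechanism distinction concrete, here is a minimal tabular sketch (hypothetical names, not the paper's exact deep-RL algorithm): a standard one-step backup updates a single transition, whereas a backward sweep over a stored episode lets a terminal reward propagate along the whole trajectory in one pass.

```python
from collections import defaultdict

def one_step_backup(Q, s, a, r, s_next, done, alpha, gamma):
    """Standard one-step Q-learning backup from a single transition."""
    bootstrap = 0.0 if done else max(Q[s_next])
    Q[s][a] += alpha * (r + gamma * bootstrap - Q[s][a])

def episodic_backward_update(Q, episode, alpha, gamma):
    """Sweep a stored episode from its last transition back to its first,
    so the terminal reward propagates through the whole trajectory in
    a single pass instead of one step per update."""
    for (s, a, r, s_next, done) in reversed(episode):
        one_step_backup(Q, s, a, r, s_next, done, alpha, gamma)

# Tiny chain MDP: states 0..3, one action, reward 1 only at the end.
Q = defaultdict(lambda: [0.0])
episode = [(s, 0, 1.0 if s == 3 else 0.0, s + 1, s == 3) for s in range(4)]
episodic_backward_update(Q, episode, alpha=1.0, gamma=0.9)
# Q[0][0] is already ~0.729 (= gamma**3) after one backward pass;
# one-step backups applied in forward order would need several passes.
```

This also illustrates the AC's point: because the backward sweep changes how aggressively value information flows, the learning rate `alpha` that works best for it need not match the one tuned for one-step backups.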