Reviews: Regularized Anderson Acceleration for Off-Policy Deep Reinforcement Learning

Neural Information Processing Systems 

This work is an interesting contribution to deep RL: it applies Anderson acceleration to improve off-policy, TD-based algorithms. The approach is supported by some theory as well as by experiments on standard benchmark problems. Overall, the reviewers like the paper and agree that it should be accepted.