Reviews: Regularized Anderson Acceleration for Off-Policy Deep Reinforcement Learning
–Neural Information Processing Systems
This work is an interesting contribution to deep RL that considers using Anderson acceleration to improve off-policy TD-based algorithms. The approach is supported by some theory as well as experiments on standard benchmark problems. Overall, the reviewers like the paper and agree that it should be accepted.
Jan-26-2025, 17:24:01 GMT