Regularized Anderson Acceleration for Off-Policy Deep Reinforcement Learning

Shi, Wenjie, Song, Shiji, Wu, Hui, Hsu, Ya-Chu, Wu, Cheng, Huang, Gao

Mar-19-2020, 00:47:49 GMT–Neural Information Processing Systems

Model-free deep reinforcement learning (RL) algorithms have been widely used for a range of complex control tasks. However, slow convergence and sample inefficiency remain challenging problems in RL, especially when handling continuous and high-dimensional state spaces. To tackle this problem, we propose a general acceleration method for model-free, off-policy deep RL algorithms by drawing the idea underlying regularized Anderson acceleration (RAA), which is an effective approach to accelerating the solving of fixed point problems with perturbations. Specifically, we first explain how policy iteration can be applied directly with Anderson acceleration. Then we extend RAA to the case of deep RL by introducing a regularization term to control the impact of perturbation induced by function approximation errors.

algorithm, off-policy deep reinforcement learning, regularized anderson acceleration, (2 more...)

Neural Information Processing Systems

Mar-19-2020, 00:47:49 GMT

Conferences Web Page

Add feedback

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)