Reinforcement Learning Based on On-Line EM Algorithm
Neural Information Processing Systems
The actor and the critic are each approximated by a Normalized Gaussian Network (NGnet), a network of local linear regression units. The NGnets are trained by the on-line EM algorithm proposed in our previous paper. We apply our RL method to two tasks: swinging up and stabilizing a single pendulum, and balancing a double pendulum near the upright position.
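To make "a network of local linear regression units" concrete, the following is a minimal sketch of an NGnet forward pass, assuming the commonly used formulation in which each unit's linear prediction is weighted by a normalized Gaussian activation. The function name `ngnet_forward`, the argument names, and the array layout are illustrative assumptions, not taken from the paper, and the on-line EM training step is not shown.

```python
import numpy as np

def ngnet_forward(x, centers, covs, weights, biases):
    """Sketch of an NGnet output for one input vector x (names are hypothetical).

    x:       (d,)       input vector
    centers: (m, d)     Gaussian centers, one per unit
    covs:    (m, d, d)  Gaussian covariance matrices, one per unit
    weights: (m, o, d)  linear regression matrices, one per unit
    biases:  (m, o)     linear regression offsets, one per unit
    """
    log_g = np.empty(len(centers))
    for i, (mu, cov) in enumerate(zip(centers, covs)):
        diff = x - mu
        _, logdet = np.linalg.slogdet(cov)
        # Log of the Gaussian density; the (2*pi)^(-d/2) factor is omitted
        # because it cancels when the activations are normalized.
        log_g[i] = -0.5 * (diff @ np.linalg.solve(cov, diff) + logdet)

    # Normalize the activations in log space for numerical stability.
    g = np.exp(log_g - log_g.max())
    n = g / g.sum()

    # Each unit's local linear prediction W_i x + b_i, shape (m, o).
    local = np.einsum("iod,d->io", weights, x) + biases

    # Output is the activation-weighted sum of the local predictions.
    return n @ local


# Usage: two one-dimensional units queried midway between their centers.
centers = np.array([[0.0], [2.0]])
covs = np.array([[[1.0]], [[1.0]]])
weights = np.array([[[1.0]], [[-1.0]]])
biases = np.array([[0.0], [4.0]])
y = ngnet_forward(np.array([1.0]), centers, covs, weights, biases)
```

In an actor-critic setting, one such network would map the state to a control signal and another would map the state to a value estimate; how the paper parameterizes each is not specified in this abstract.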
Dec-31-1999