Reviews: Neural Proximal/Trust Region Policy Optimization Attains Globally Optimal Policy
