Quasi-Newton Trust Region Policy Optimization