Efficiently Escaping Saddle Points for Non-Convex Policy Optimization