Policy Optimization Through Approximated Importance Sampling