Dual Approximation Policy Optimization