Marginal Policy Gradients for Complex Control