Investigation on the generalization of the Sampled Policy Gradient algorithm