Difference of Convex Functions Programming for Reinforcement Learning