Deep reinforcement learning for weakly coupled MDP's with continuous actions