Learning Optimal Control with Stochastic Models of Hamiltonian Dynamics