Control in Stochastic Environment with Delays: A Model-based Reinforcement Learning Approach