Reinforcement Learning for Solving Stochastic Vehicle Routing Problem with Time Windows