Multi-Grid Methods for Reinforcement Learning in Controlled Diffusion Processes