Models of Delayed Reinforcement Learning