Fast deep reinforcement learning using online adjustments from the past