Goal Reduction with Loop-Removal Accelerates RL and Models Human Brain Activity in Goal-Directed Learning

Neural Information Processing Systems 

Survival pressure rules out slow adaptation whose cost grows linearly with the number of goals, i.e., learning a value function from scratch for each new objective.