Planning with General Objective Functions: Going Beyond Total Rewards

Ruosong Wang

Neural Information Processing Systems 

This "small" difference requires the agent to change the planning strategy significantly because the