Consistent Aggregation of Objectives with Diverse Time Preferences Requires Non-Markovian Rewards

Neural Information Processing Systems 

This paper extends previous normative work in reinforcement learning (RL) by adopting an axiomatic approach to the aggregation of objectives.