Consistent Aggregation of Objectives with Diverse Time Preferences Requires Non-Markovian Rewards
Neural Information Processing Systems
This paper extends previous normative work in reinforcement learning (RL) by adopting an axiomatic approach to the aggregation of objectives.
Nov-13-2025, 10:03:44 GMT