Consistent Aggregation of Objectives with Diverse Time Preferences Requires Non-Markovian Rewards
Neural Information Processing Systems
This paper extends previous normative work in reinforcement learning (RL) by adopting an axiomatic approach to the aggregation of objectives.
Oct-8-2025, 01:56:36 GMT