Consistent Aggregation of Objectives with Diverse Time Preferences Requires Non-Markovian Rewards

Neural Information Processing Systems 

This paper extends previous normative work in reinforcement learning (RL) by adopting an axiomatic approach to the aggregation of objectives.
