Accommodating Picky Customers: Regret Bound and Exploration Complexity for Multi-Objective Reinforcement Learning

Neural Information Processing Systems 

In this paper, we consider multi-objective reinforcement learning, where the objectives are balanced using preferences.
