Reviews: Repeated Inverse Reinforcement Learning
–Neural Information Processing Systems
The authors present a learning framework for inverse reinforcement learning in which an agent proposes policies for a variety of related tasks and a human determines whether each proposed policy is acceptable. They present algorithms for learning the human's latent reward function across the tasks, and they give upper and lower bounds on the performance of the algorithms. They also address the setting where an agent is "corrected" as it executes trajectories. This is a comprehensive theoretical treatment of a new conceptualization of IRL that I think is valuable. I have broad clarification/scoping questions and a few minor points.
Oct-8-2024, 03:46:41 GMT