Export Reviews, Discussions, Author Feedback and Meta-Reviews
–Neural Information Processing Systems
This paper addresses the problem of inverse reinforcement learning (IRL) when the agent can change its objective while its trajectories are being recorded. The observed behavior is therefore governed by a sequence of reward functions, each of which explains the agent's trajectory only locally. The transition probabilities between reward functions are unknown. The authors propose a cascade of EM and Viterbi algorithms to discover the reward functions and the segments on which each is valid. The paper is quite well written. However, its coverage of the IRL state of the art stops at 2012.
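To make the segmentation step concrete, here is a minimal sketch of Viterbi decoding over reward-function indices. It is not the authors' implementation. It assumes that per-step log-likelihoods under each candidate reward function and estimates of the transition probabilities are already available (for instance from an EM stage). The names `viterbi_segments`, `log_lik`, `log_trans`, and `log_init`, and all numbers in the toy example, are hypothetical.

```python
import math


def viterbi_segments(log_lik, log_trans, log_init):
    """Return the most likely sequence of active reward indices.

    log_lik[t][k]   : log-likelihood of observed step t under reward k (assumed given)
    log_trans[j][k] : log-probability of switching from reward j to reward k
    log_init[k]     : log-probability that reward k is active at step 0
    """
    T, K = len(log_lik), len(log_init)
    score = [log_init[k] + log_lik[0][k] for k in range(K)]
    back = []  # back[t-1][k] = best predecessor of reward k at step t
    for t in range(1, T):
        prev, score, ptr = score, [], []
        for k in range(K):
            best_j = max(range(K), key=lambda j: prev[j] + log_trans[j][k])
            ptr.append(best_j)
            score.append(prev[best_j] + log_trans[best_j][k] + log_lik[t][k])
        back.append(ptr)
    # Backtrack from the best final reward index.
    k = max(range(K), key=lambda j: score[j])
    path = [k]
    for ptr in reversed(back):
        k = ptr[k]
        path.append(k)
    path.reverse()
    return path


# Toy example (made-up numbers): steps 0-2 are better explained by reward 0,
# steps 3-5 by reward 1, and reward switches are assumed to be rare.
log = math.log
log_lik = [[log(0.9), log(0.1)]] * 3 + [[log(0.1), log(0.9)]] * 3
log_trans = [[log(0.9), log(0.1)], [log(0.1), log(0.9)]]
log_init = [log(0.5), log(0.5)]

path = viterbi_segments(log_lik, log_trans, log_init)
print(path)
```

Contiguous runs of the same index in `path` correspond to the segments on which a single reward function is considered valid.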
Feb-6-2025, 23:47:43 GMT