Identifiability in inverse reinforcement learning
–Neural Information Processing Systems
Inverse reinforcement learning attempts to reconstruct the reward function in a Markov decision problem, using observations of agent actions. As already observed in Russell [1998], the problem is ill-posed, and the reward function is not identifiable, even in the presence of perfect information about optimal behavior. We provide a resolution to this non-identifiability for problems with entropy regularization.
Feb-9-2026, 03:12:37 GMT
- Country:
- North America > United States
- Illinois > Cook County
- Chicago (0.04)
- New York > New York County
- New York City (0.04)
- Wisconsin > Dane County
- Madison (0.04)
- Technology: