Identifiability in inverse reinforcement learning

Neural Information Processing Systems 

Inverse reinforcement learning attempts to reconstruct the reward function in a Markov decision problem, using observations of agent actions. As already observed by Russell [1998], the problem is ill-posed, and the reward function is not identifiable, even in the presence of perfect information about optimal behavior. We provide a resolution to this non-identifiability for problems with entropy regularization.
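
The non-identifiability described in the abstract can be illustrated with a small numerical sketch. It assumes a tabular, discounted Markov decision problem solved by entropy-regularized (soft) value iteration, and it uses potential-based reward shaping as one well-known source of ambiguity; this construction is not taken from the paper, and all names (`soft_optimal_policy`, `phi`, `lam`, the random problem itself) are hypothetical. Two different reward functions end up producing the same optimal policy, so observing the policy alone cannot tell them apart.

```python
import math
import random


def soft_optimal_policy(P, r, gamma=0.9, lam=1.0, iters=500):
    """Entropy-regularized (soft) value iteration on a tabular problem.

    P[s][a][t]: probability of moving from state s to state t under action a.
    r[s][a]:    reward for taking action a in state s.
    lam:        weight of the entropy regularization (assumed > 0).
    Returns pi[s][a], the soft-optimal policy.
    """
    n_states, n_actions = len(r), len(r[0])

    def q_values(V):
        return [[r[s][a] + gamma * sum(P[s][a][t] * V[t] for t in range(n_states))
                 for a in range(n_actions)]
                for s in range(n_states)]

    def soft_max(q_row):
        # Numerically stable lam * log(sum(exp(q / lam))).
        m = max(q_row)
        return m + lam * math.log(sum(math.exp((q - m) / lam) for q in q_row))

    V = [0.0] * n_states
    for _ in range(iters):
        V = [soft_max(row) for row in q_values(V)]

    Q = q_values(V)
    # pi(a|s) = exp((Q(s,a) - V(s)) / lam), normalized exactly per state.
    return [[math.exp((q - soft_max(row)) / lam) for q in row] for row in Q]


# A small random problem (purely illustrative).
random.seed(0)
n_states, n_actions, gamma = 4, 3, 0.9

P = []
for s in range(n_states):
    P.append([])
    for a in range(n_actions):
        w = [random.random() for _ in range(n_states)]
        total = sum(w)
        P[s].append([x / total for x in w])

r = [[random.gauss(0, 1) for _ in range(n_actions)] for _ in range(n_states)]

# Shape the reward with an arbitrary state potential phi:
# r'(s,a) = r(s,a) + gamma * E[phi(s') | s, a] - phi(s).
phi = [random.gauss(0, 1) for _ in range(n_states)]
r_shaped = [[r[s][a]
             + gamma * sum(P[s][a][t] * phi[t] for t in range(n_states))
             - phi[s]
             for a in range(n_actions)]
            for s in range(n_states)]

pi_original = soft_optimal_policy(P, r, gamma=gamma)
pi_shaped = soft_optimal_policy(P, r_shaped, gamma=gamma)

# Largest difference between the two policies and between the two rewards.
max_policy_gap = max(abs(pi_original[s][a] - pi_shaped[s][a])
                     for s in range(n_states) for a in range(n_actions))
max_reward_gap = max(abs(r[s][a] - r_shaped[s][a])
                     for s in range(n_states) for a in range(n_actions))
```

Under these assumptions, `max_policy_gap` is zero up to numerical error while `max_reward_gap` is not: the shaped problem's soft value function is the original one shifted by `-phi`, so the shift cancels in the policy.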
