Inverse Reinforcement Learning with the Average Reward Criterion
Neural Information Processing Systems
We study the problem of Inverse Reinforcement Learning (IRL) with an average-reward criterion. The goal is to recover an unknown policy and a reward function when the learner only has samples of states and actions from an experienced agent. Previous IRL methods assume that the expert is trained in a discounted environment and that the discount factor is known.
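For orientation, the two criteria contrasted above can be written out in their standard textbook form. The notation here is an assumption for illustration and is not taken from the paper: $\pi$ is a policy, $r$ a reward function, $\gamma \in [0,1)$ a discount factor, and $(s_t, a_t)$ the state-action pair at time $t$.

```latex
% Discounted criterion: requires a discount factor \gamma, assumed known by previous IRL methods.
J_\gamma(\pi) = \mathbb{E}_\pi\!\left[ \sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, a_t) \right]

% Average-reward criterion: the long-run reward per step, with no discount factor.
\rho(\pi) = \lim_{T \to \infty} \frac{1}{T}\, \mathbb{E}_\pi\!\left[ \sum_{t=0}^{T-1} r(s_t, a_t) \right]
```

The second expression contains no $\gamma$, so under the average-reward criterion there is no discount factor that needs to be known in advance.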
Dec-26-2025, 22:42:58 GMT