Review for NeurIPS paper: Inverse Reinforcement Learning from a Gradient-based Learner

Jan-22-2025, 04:55:52 GMT–Neural Information Processing Systems

Weaknesses: I have several concerns about the proposed approach. First, the empirical results send mixed messages. In one of the three tasks (reacher), the LfL baseline significantly outperforms LOGEL (Figure 4, left). In another task (hopper), the policy trained with the reward function recovered by LOGEL outperforms the policy trained on the true reward function. And what kind of reward function does the LfL baseline recover for the hopper task, such that it leads to no learning at all?

gradient-based learner, inverse reinforcement learning, reward function
