Review for NeurIPS paper: Adversarial Soft Advantage Fitting: Imitation Learning without Policy Optimization

Jan-26-2025, 14:02:07 GMT – Neural Information Processing Systems

Correctness: The claims and experiments seem mostly correct. While the analysis shows that the solution to the min-max problem (Eq. [...] I would raise my score if the paper were updated to include a proof that the proposed algorithm converges. The experiments do not actually show that the proposed method mimics the expert, only that running the proposed algorithm on expert-generated data yields high reward. I would also raise my score if an experiment were added showing that the learned policy actually mimics the demonstrator.



