Is Optimal Transport Necessary for Inverse Reinforcement Learning?
Zixuan Dong, Yumi Omori, Keith Ross
–arXiv.org Artificial Intelligence
Inverse Reinforcement Learning (IRL) aims to recover a reward function from expert demonstrations. Recently, Optimal Transport (OT) methods have been successfully deployed to align trajectories and infer rewards. While OT-based methods have shown strong empirical results, they introduce algorithmic complexity and hyperparameter sensitivity, and they require solving OT optimization problems. In this work, we challenge the necessity of OT in IRL by proposing two simple, heuristic alternatives: (1) Minimum-Distance Reward, which assigns rewards based on the nearest expert state regardless of temporal order; and (2) Segment-Matching Reward, which incorporates lightweight temporal alignment by matching agent states to corresponding segments in the expert trajectory. These methods avoid optimization, exhibit linear-time complexity, and are easy to implement. Through extensive evaluations across 32 online and offline benchmarks with three reinforcement learning algorithms, we show that our simple rewards match or outperform recent OT-based approaches. Our findings suggest that the core benefits of OT may arise from basic proximity alignment rather than its optimal coupling formulation, advocating for a reevaluation of complexity in future IRL design.
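The two heuristics described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the distance metric (Euclidean), the negative-distance reward scaling, and the fixed-size matching window are all assumptions made for the example.

```python
import numpy as np

def min_distance_reward(agent_states, expert_states):
    """Minimum-Distance Reward (sketch): reward each agent state by the
    negative distance to its nearest expert state, ignoring temporal order."""
    # Pairwise Euclidean distances, shape (n_agent, n_expert).
    dists = np.linalg.norm(
        agent_states[:, None, :] - expert_states[None, :, :], axis=-1)
    return -dists.min(axis=1)

def segment_matching_reward(agent_states, expert_states, window=5):
    """Segment-Matching Reward (sketch): reward each agent state by the
    negative distance to the nearest expert state inside a temporally
    aligned segment of the expert trajectory."""
    n, m = len(agent_states), len(expert_states)
    rewards = np.empty(n)
    for t in range(n):
        # Map agent timestep t to a proportional index in the expert trajectory.
        center = int(t / max(n - 1, 1) * (m - 1))
        lo, hi = max(0, center - window), min(m, center + window + 1)
        segment = expert_states[lo:hi]
        rewards[t] = -np.linalg.norm(segment - agent_states[t], axis=-1).min()
    return rewards
```

Both functions run in time linear in the trajectory lengths (up to the window or expert-set size per state) and involve no optimization, which is the property the abstract contrasts with OT-based reward inference.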
Jun-10-2025