Is Optimal Transport Necessary for Inverse Reinforcement Learning?

Dong, Zixuan, Omori, Yumi, Ross, Keith

arXiv.org Artificial Intelligence 

Inverse Reinforcement Learning (IRL) aims to recover a reward function from expert demonstrations. Recently, Optimal Transport (OT) methods have been successfully deployed to align trajectories and infer rewards. While OT-based methods have shown strong empirical results, they introduce algorithmic complexity, hyperparameter sensitivity, and require solving OT optimization problems. In this work, we challenge the necessity of OT in IRL by proposing two simple, heuristic alternatives: (1) Minimum-Distance Reward, which assigns rewards based on the nearest expert state regardless of temporal order; and (2) Segment-Matching Reward, which incorporates lightweight temporal alignment by matching agent states to corresponding segments of the expert trajectory. These methods avoid optimization, exhibit linear-time complexity, and are easy to implement. Through extensive evaluations across 32 online and offline benchmarks with three reinforcement learning algorithms, we show that our simple rewards match or outperform recent OT-based approaches. Our findings suggest that the core benefits of OT may arise from basic proximity alignment rather than its optimal coupling formulation, advocating for a reevaluation of complexity in future IRL design.
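The two heuristics described in the abstract can be sketched as follows. This is a minimal illustration based only on the abstract's description, not the authors' actual implementation: it assumes Euclidean distance between states and an even partition of the expert trajectory into per-timestep segments, both of which are assumptions.

```python
import numpy as np

def min_distance_reward(agent_state, expert_states):
    """Minimum-Distance Reward (sketch): negative distance to the
    nearest expert state, ignoring temporal order entirely."""
    dists = np.linalg.norm(expert_states - agent_state, axis=1)
    return -float(dists.min())

def segment_matching_reward(agent_states, expert_states):
    """Segment-Matching Reward (sketch): partition the expert
    trajectory into one segment per agent timestep, then score each
    agent state against only its corresponding segment. This keeps
    lightweight temporal alignment while remaining linear-time."""
    T, E = len(agent_states), len(expert_states)
    rewards = []
    for t, s in enumerate(agent_states):
        lo = (t * E) // T
        hi = max(lo + 1, ((t + 1) * E) // T)  # segment is never empty
        seg = expert_states[lo:hi]
        rewards.append(-float(np.linalg.norm(seg - s, axis=1).min()))
    return rewards
```

Neither function solves an optimization problem; both run in time linear in the product of trajectory lengths and state dimension, consistent with the complexity claim in the abstract.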
