Reviews: Imitation Learning from Observations by Minimizing Inverse Dynamics Disagreement

Feb-4-2025, 23:37:31 GMT–Neural Information Processing Systems

Learning from Observation (LfO) is harder, but more practical, than Learning from Demonstration (LfD), which involves both action and state supervision. The paper studies the difference between the two types of learning from both theoretical and practical perspectives, and relates the gap between LfD and LfO to the inverse dynamics disagreement between the imitator and the expert. The paper includes an elaborate and interesting theoretical analysis of this gap, and proposes a method for bridging the gap through entropy maximization. The empirical evaluation is also thorough and includes a toy problem for studying the effect of inverse dynamics disagreement, MuJoCo problems, and an ablation study. The reviewers are in agreement that this is a good, technically sound paper.
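For readers who want the gap stated concretely, the following is a rough sketch of how such a relation can arise from the chain rule for KL divergence. It assumes the imitator and the expert share the same transition dynamics. The symbols $\rho_\pi$ and $\rho_E$ (occupancy measures of the imitator and the expert) are notation introduced here for illustration and may not match the paper's exactly.

```latex
% Sketch only. Assumption: both agents share the dynamics P(s' | s, a).
% Expand the joint divergence over (s, a, s') in two ways.
\begin{align}
D_{\mathrm{KL}}\big(\rho_\pi(s,a,s') \,\|\, \rho_E(s,a,s')\big)
  &= D_{\mathrm{KL}}\big(\rho_\pi(s,a) \,\|\, \rho_E(s,a)\big)
     && \text{(the shared dynamics term is zero)} \\
  &= D_{\mathrm{KL}}\big(\rho_\pi(s,s') \,\|\, \rho_E(s,s')\big)
     + \mathbb{E}_{\rho_\pi(s,s')}\Big[
       D_{\mathrm{KL}}\big(\rho_\pi(a \mid s,s') \,\|\, \rho_E(a \mid s,s')\big)\Big]
\end{align}
% Equating the two expansions gives
\begin{equation}
\underbrace{\mathbb{E}_{\rho_\pi(s,s')}\Big[
  D_{\mathrm{KL}}\big(\rho_\pi(a \mid s,s') \,\|\, \rho_E(a \mid s,s')\big)\Big]}_{\text{inverse dynamics disagreement}}
= \underbrace{D_{\mathrm{KL}}\big(\rho_\pi(s,a) \,\|\, \rho_E(s,a)\big)}_{\text{LfD objective}}
- \underbrace{D_{\mathrm{KL}}\big(\rho_\pi(s,s') \,\|\, \rho_E(s,s')\big)}_{\text{LfO objective}}
\end{equation}
```

Under these assumptions, the difference between matching state-action pairs and matching state transitions is the disagreement between the two inverse dynamics models, which is the quantity the review refers to.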

learning, minimizing inverse dynamic disagreement, observation, (3 more...)
