Robust Imitation via Mirror Descent Inverse Reinforcement Learning
–Neural Information Processing Systems
Inspired by a first-order optimization method called mirror descent, this paper proposes to predict a sequence of reward functions, which are iterative solutions for a constrained convex problem. IRL solutions derived by mirror descent are tolerant to the uncertainty incurred by target density estimation since the amount of reward learning is regulated with respect to local geometric constraints.
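To make the idea of a geometry-regulated iterative update concrete, here is a minimal sketch of generic mirror descent, not the paper's IRL algorithm. It assumes a negative-entropy mirror map on the probability simplex (the exponentiated-gradient update) and a toy quadratic objective; the function name `mirror_descent_simplex`, the `target` vector, and the step size are all hypothetical choices made for illustration.

```python
import math


def mirror_descent_simplex(grad, x0, step_size, num_steps):
    """Entropic mirror descent (exponentiated gradient) on the probability simplex.

    Each step minimizes a linearized objective plus a KL (Bregman) penalty to
    the previous iterate, so the size of every update is regulated by the
    local geometry induced by the mirror map.
    """
    x = list(x0)
    iterates = [list(x)]
    for _ in range(num_steps):
        g = grad(x)
        # Gradient step taken in the dual (log) coordinates.
        unnormalized = [xi * math.exp(-step_size * gi) for xi, gi in zip(x, g)]
        # The KL projection back onto the simplex is a renormalization.
        total = sum(unnormalized)
        x = [u / total for u in unnormalized]
        iterates.append(list(x))
    return iterates


# Toy convex problem (an assumption for illustration):
# minimize 0.5 * ||x - target||^2 over the simplex.
target = [0.5, 0.3, 0.2]


def grad(x):
    return [xi - ti for xi, ti in zip(x, target)]


iterates = mirror_descent_simplex(grad, [1 / 3, 1 / 3, 1 / 3], step_size=1.0, num_steps=500)
final = iterates[-1]
print(final)
```

The returned list is the sequence of iterates, loosely analogous to the sequence of reward functions described above: each element is the solution of one regularized step rather than a single final answer.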
Feb-11-2026, 18:16:17 GMT