Robust Imitation via Mirror Descent Inverse Reinforcement Learning

Neural Information Processing Systems 

Inspired by a first-order optimization method called mirror descent, this paper proposes to predict a sequence of reward functions, which are iterative solutions for a constrained convex problem. IRL solutions derived by mirror descent are tolerant to the uncertainty incurred by target density estimation since the amount of reward learning is regulated with respect to local geometric constraints.
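To make the notion of "iterative solutions for a constrained convex problem" concrete, the following is a minimal sketch of generic mirror descent on a toy problem. It is not the paper's IRL algorithm. It assumes the negative-entropy mirror map on the probability simplex (the exponentiated-gradient update) and a simple quadratic objective; the names `mirror_descent_simplex`, `grad`, and `target`, as well as the step size and iteration count, are hypothetical choices made for illustration.

```python
import math


def mirror_descent_simplex(grad, x0, step_size=0.5, num_steps=200):
    """Entropic mirror descent on the probability simplex (illustrative only).

    Returns the whole sequence of iterates, mirroring the idea of a
    sequence of solutions rather than a single final answer.
    """
    x = list(x0)
    iterates = [x]
    for _ in range(num_steps):
        g = grad(x)
        # Mirror step under the negative-entropy mirror map:
        # a multiplicative update instead of an additive one.
        w = [xi * math.exp(-step_size * gi) for xi, gi in zip(x, g)]
        # The Bregman (KL) projection back onto the simplex is a normalization.
        total = sum(w)
        x = [wi / total for wi in w]
        iterates.append(x)
    return iterates


# Toy objective (an assumption): f(x) = 0.5 * ||x - target||^2, with target on the simplex.
target = [0.7, 0.2, 0.1]


def grad(x):
    return [xi - ti for xi, ti in zip(x, target)]


iterates = mirror_descent_simplex(grad, [1 / 3, 1 / 3, 1 / 3])
final = iterates[-1]
print([round(v, 3) for v in final])
```

Each update here is bounded by a KL-divergence term around the current iterate, which is one simple instance of regulating the size of a step through local geometry; the step size controls how far a single, possibly noisy, gradient can move the solution.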
