Towards the Transferability of Rewards Recovered via Regularized Inverse Reinforcement Learning

Neural Information Processing Systems 

Misaligned rewards can lead to suboptimal behaviors [Ngo et al., 2022], undermining the potential benefits of RL in practical scenarios.
