Towards the Transferability of Rewards Recovered via Regularized Inverse Reinforcement Learning
–Neural Information Processing Systems
Misaligned rewards can lead to suboptimal behaviors [Ngo et al., 2022], undermining the potential benefits of RL in practical scenarios.
Feb-9-2026, 16:45:28 GMT