Rectifying Shortcut Behaviors in Preference-based Reward Learning
