Similarity as Reward Alignment: Robust and Versatile Preference-based Reinforcement Learning
