On Symmetric Losses for Robust Policy Optimization with Noisy Preferences
