Adaptive Margin RLHF via Preference over Preferences
