Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment
