Not All Preferences are What You Need for Post-Training: Selective Alignment Strategy for Preference Optimization
