RePO: Understanding Preference Learning Through ReLU-Based Optimization
