DPO-Shift: Shifting the Distribution of Direct Preference Optimization
