Mallows-DPO: Fine-Tune Your LLM with Preference Dispersions
