Aligning Language Models with Preferences through f-divergence Minimization
