Stackelberg Game Preference Optimization for Data-Efficient Alignment of Language Models
