Game-Theoretic Regularized Self-Play Alignment of Large Language Models
