Training Language Models to Win Debates with Self-Play Improves Judge Accuracy
