RoSE: Round-robin Synthetic Data Evaluation for Selecting LLM Generators without Human Test Sets
