RoSE: Round-robin Synthetic Data Evaluation for Selecting LLM Generators without Human Test Sets