PersonaEval: Are LLM Evaluators Human Enough to Judge Role-Play?
