Structured Prompting Enables More Robust Evaluation of Language Models
