Report Cards: Qualitative Evaluation of Language Models Using Natural Language Summaries
