C-Eval: A Multi-Level Multi-Discipline Chinese Evaluation Suite for Foundation Models

Jan-19-2025, 21:57:34 GMT – Neural Information Processing Systems

New NLP benchmarks are urgently needed to align with the rapid development of large language models (LLMs). We present C-Eval, the first comprehensive Chinese evaluation suite designed to assess advanced knowledge and reasoning abilities of foundation models in a Chinese context. C-Eval comprises multiple-choice questions across four difficulty levels: middle school, high school, college, and professional. The questions span 52 diverse disciplines, ranging from humanities to science and engineering. C-Eval is accompanied by C-Eval Hard, a subset of very challenging subjects in C-Eval that requires advanced reasoning abilities to solve.

c-eval, foundation model, multi-level multi-discipline chinese evaluation suite, (2 more...)

Industry:
- Education > Educational Setting > K-12 Education (0.63)

Technology:
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.92)