Benchmarking Large Language Models on CMExam -- A Comprehensive Chinese Medical Exam Dataset
