Benchmarking Large Language Models on CMExam - A Comprehensive Chinese Medical Exam Dataset
