TCM-Ladder: A Benchmark for Multimodal Question Answering on Traditional Chinese Medicine

Jun-14-2026, 04:22:49 GMT–Neural Information Processing Systems

Traditional Chinese Medicine (TCM), an effective form of alternative medicine, has been receiving increasing attention. In recent years, the rapid development of large language models (LLMs) tailored for TCM has highlighted the urgent need for an objective and comprehensive evaluation framework to assess their performance on real-world tasks. However, existing evaluation datasets are limited in scope and primarily text-based, and the field lacks a unified, standardized multimodal question-answering (QA) benchmark. To address this issue, we introduce TCM-Ladder, the first comprehensive multimodal QA dataset specifically designed for evaluating TCM-specific LLMs. The dataset covers multiple core disciplines of TCM, including fundamental theory, diagnostics, herbal formulas, internal medicine, surgery, pharmacognosy, and pediatrics.

artificial intelligence, large language model, natural language

Industry:
- Health & Medicine > Therapeutic Area (0.58)

Technology:
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.63)