DeepSeek performs better than other Large Language Models in Dental Cases

Hexian Zhang, Xinyu Yan, Yanqi Yang, Lijian Jin, Ping Yang, Junwen Wang

arXiv.org Artificial Intelligence 

Division of Epidemiology, Department of Quantitative Health Sciences, Mayo Clinic, Scottsdale, AZ 85259, USA

Correspondence: Hexian Zhang, chordzhang@connect.hku.hk, Tel: (852) 2852 0128, Fax: (852) 2548 9464

Abstract word count: 185
Total word count: 3167
Total number of tables: 2
Total number of figures: 3
Number of references: 32

Keywords: Artificial Intelligence, Deep Learning/Machine Learning, Dental Education, Electronic Dental Records, Periodontal Medicine

Abstract

Aims: Periodontology, with its wealth of structured clinical data, offers an ideal setting to evaluate the reasoning abilities of large language models (LLMs). This study aims to assess four LLMs (GPT-4o, Gemini 2.0 Flash, Copilot, and DeepSeek V3) in interpreting longitudinal periodontal case vignettes through open-ended tasks.

Materials and Methods: Thirty-four standardized longitudinal periodontal case vignettes were curated, generating 258 open-ended question-answer pairs. Each model was prompted to review case details and produce responses. Performance was evaluated using automated metrics (faithfulness, answer relevancy, readability) and blinded assessments by licensed dentists on a five-point Likert scale.

Results: DeepSeek V3 achieved the highest median faithfulness score (0.528), outperforming GPT-4o (0.457), Gemini 2.0 Flash (0.421), and Copilot (0.367).
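The abstract lists readability among the automated metrics but does not specify the formula at this point in the text. A common choice for scoring model responses is the Flesch Reading Ease formula; the sketch below is a minimal, self-contained illustration of that metric (the syllable counter is a crude vowel-group heuristic, not the exact tooling used in the study).

```python
import re

def count_syllables(word: str) -> int:
    # Crude heuristic: count groups of consecutive vowels (y treated as a vowel).
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def flesch_reading_ease(text: str) -> float:
    # Flesch Reading Ease = 206.835 - 1.015*(words/sentences) - 84.6*(syllables/words).
    # Higher scores indicate easier-to-read text.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    if not sentences or not words:
        return 0.0
    avg_sentence_len = len(words) / len(sentences)
    avg_syllables = sum(count_syllables(w) for w in words) / len(words)
    return 206.835 - 1.015 * avg_sentence_len - 84.6 * avg_syllables

# Short, monosyllabic text scores high; dense clinical prose scores much lower.
simple = flesch_reading_ease("The cat sat on the mat.")
dense = flesch_reading_ease(
    "Periodontology offers structured longitudinal clinical documentation."
)
```

This kind of surface-level metric complements, but does not replace, the blinded dentist ratings described in the Methods.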