Can Zero-Shot Commercial APIs Deliver Regulatory-Grade Clinical Text DeIdentification?

Kocaman, Veysel, Santas, Muhammed, Gul, Yigit, Butgul, Mehmet, Talby, David

Mar-31-2025–arXiv.org Artificial Intelligence

We evaluate four leading solutions for de-identifying unstructured medical text (Azure Health Data Services, AWS Comprehend Medical, OpenAI GPT-4o, and John Snow Labs) on a ground-truth dataset of 48 clinical documents annotated by medical experts. The analysis, conducted at both the entity level and the token level, suggests that John Snow Labs' Medical Language Models solution achieves the highest accuracy, with a 96% F1-score in protected health information (PHI) detection, outperforming Azure (91%), AWS (83%), and GPT-4o (79%). John Snow Labs is the only solution that achieves regulatory-grade accuracy (surpassing that of human experts), and it is also the most cost-effective: it is over 80% cheaper than Azure and GPT-4o and is the only solution not priced by token. Its fixed-cost local deployment model avoids the escalating per-request fees of cloud-based services, making it a scalable and economical choice.
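The difference between entity-level and token-level scoring can be illustrated with a minimal sketch. This is not the authors' evaluation code: the span representation `(start_token, end_token_exclusive, label)`, the exact-match rule for entities, and the helper names `prf`, `entity_level`, and `token_level` are all assumptions made for illustration.

```python
from typing import List, Set, Tuple

# Hypothetical span format: (start_token, end_token_exclusive, label)
Span = Tuple[int, int, str]


def prf(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) from raw counts, guarding against zero division."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def entity_level(gold: List[Span], pred: List[Span]) -> Tuple[float, float, float]:
    """Assumed rule: a predicted entity counts only if boundaries and label match exactly."""
    g, p = set(gold), set(pred)
    return prf(len(g & p), len(p - g), len(g - p))


def token_level(gold: List[Span], pred: List[Span]) -> Tuple[float, float, float]:
    """Score each labeled token independently, so partial overlaps earn partial credit."""
    def tokens(spans: List[Span]) -> Set[Tuple[int, str]]:
        return {(i, label) for start, end, label in spans for i in range(start, end)}

    g, p = tokens(gold), tokens(pred)
    return prf(len(g & p), len(p - g), len(g - p))


# Toy example: the system finds only the first token of a two-token name.
gold = [(0, 2, "NAME"), (5, 6, "DATE")]
pred = [(0, 1, "NAME"), (5, 6, "DATE")]

entity_scores = entity_level(gold, pred)
token_scores = token_level(gold, pred)
```

In this toy case the truncated name is a full miss at the entity level but a partial hit at the token level, which is one reason the two views of the same predictions can report different F1-scores.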

large language model, machine learning, natural language

Country:
- North America > United States (0.46)
- Europe > Italy (0.04)

Genre:
- Research Report
  - New Finding (1.00)
  - Experimental Study (0.94)

Industry:
- Law (1.00)
- Information Technology
  - Security & Privacy (1.00)
  - Services (0.90)
- Health & Medicine
  - Government Relations & Public Policy (0.89)
  - Health Care Technology > Medical Record (0.71)
  - Health Care Providers & Services (0.69)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (1.00)
