Can Zero-Shot Commercial APIs Deliver Regulatory-Grade Clinical Text DeIdentification?
Kocaman, Veysel, Santas, Muhammed, Gul, Yigit, Butgul, Mehmet, Talby, David
arXiv.org Artificial Intelligence
We evaluate four leading solutions for de-identifying unstructured medical text (Azure Health Data Services, AWS Comprehend Medical, OpenAI GPT-4o, and John Snow Labs) on a ground-truth dataset of 48 clinical documents annotated by medical experts. The analysis, conducted at both the entity level and the token level, suggests that John Snow Labs' Medical Language Models solution achieves the highest accuracy, with a 96% F1-score in protected health information (PHI) detection, outperforming Azure (91%), AWS (83%), and GPT-4o (79%). John Snow Labs is the only solution that achieves regulatory-grade accuracy (surpassing that of human experts), and it is also the most cost-effective: it is over 80% cheaper than Azure and GPT-4o and is the only solution not priced by token. Its fixed-cost local deployment model avoids the escalating per-request fees of cloud-based services, making it a scalable and economical choice.
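As a rough illustration of what a token-level F1-score for PHI detection measures, the sketch below scores a predicted label sequence against a gold one. It is a minimal example under stated assumptions, not the paper's evaluation protocol: the function name `token_f1`, the label set, and the choice to count any non-"O" label as PHI (ignoring entity type) are all hypothetical.

```python
def token_f1(gold, pred):
    """Binary PHI-vs-non-PHI scoring over aligned token labels.

    Assumption: "O" marks a non-PHI token and any other label marks PHI;
    the entity type is ignored when matching.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same number of tokens")

    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g != "O" and p != "O")  # PHI found
    fp = sum(1 for g, p in pairs if g == "O" and p != "O")  # false alarm
    fn = sum(1 for g, p in pairs if g != "O" and p == "O")  # PHI missed

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Illustrative labels only; not data from the study.
gold = ["O", "NAME", "NAME", "O", "DATE", "O"]
pred = ["O", "NAME", "O", "O", "DATE", "DATE"]
precision, recall, f1 = token_f1(gold, pred)
```

Entity-level scoring differs in that a multi-token entity counts as detected only when the predicted span matches it, so the two views can disagree on the same output.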
Mar-31-2025
- Country:
- Europe > Italy (0.04)
- North America > United States (0.46)
- Genre:
- Research Report
- Experimental Study (0.94)
- New Finding (1.00)
- Industry:
- Health & Medicine
- Information Technology
- Security & Privacy (1.00)
- Services (0.90)
- Law (1.00)
- Technology: