A Clinical Trial Design Approach to Auditing Language Models in Healthcare Setting

Dec-18-2024, arXiv.org Artificial Intelligence

We present an audit mechanism for language models, focusing on models deployed in healthcare settings. The mechanism draws on clinical trial design: we frame the language model audit as a single-blind equivalence trial in which subject matter experts serve as the comparator. We show that this framing supports principled sample size and power calculations, so that an audit needs to sample only the minimum number of records required to maintain audit integrity and statistical soundness. Finally, we provide a real-world example of the audit as used in a production environment in a large-scale public health network.
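The abstract does not reproduce the paper's own calculation, so the following is only a minimal sketch of the kind of sample size computation an equivalence trial design relies on. It assumes a binary outcome (for example, a record judged acceptable or not), two independent arms (model and subject matter experts) with the same expected rate, a true difference of zero, and the normal approximation used with two one-sided tests (TOST). The function name `equivalence_sample_size` and all parameter values are hypothetical and are not taken from the paper.

```python
import math
from statistics import NormalDist


def equivalence_sample_size(p_expected, margin, alpha=0.05, power=0.80):
    """Approximate records needed per arm for a TOST equivalence design.

    Assumptions (illustrative, not from the paper): binary outcome, two
    independent arms sharing the same expected rate, true difference of
    zero, normal approximation.
    """
    if not 0.0 < p_expected < 1.0:
        raise ValueError("p_expected must lie strictly between 0 and 1")
    if margin <= 0.0:
        raise ValueError("margin must be positive")
    if not 0.0 < alpha < 1.0 or not 0.0 < power < 1.0:
        raise ValueError("alpha and power must lie strictly between 0 and 1")

    z = NormalDist().inv_cdf            # standard normal quantile function
    z_alpha = z(1.0 - alpha)            # each one-sided test runs at level alpha
    beta = 1.0 - power
    z_beta = z(1.0 - beta / 2.0)        # beta is split across the two one-sided tests

    # Variance of the difference between two independent proportions.
    variance = 2.0 * p_expected * (1.0 - p_expected)
    n = (z_alpha + z_beta) ** 2 * variance / margin ** 2
    return math.ceil(n)                 # round up to whole records


if __name__ == "__main__":
    # Hypothetical inputs: 90% expected rate, 10-point equivalence margin.
    print(equivalence_sample_size(p_expected=0.90, margin=0.10))
```

Under these assumptions, a narrower equivalence margin or a higher target power increases the required number of records, which is the trade-off an auditor would tune. A paired design, in which the same records are reviewed by both the model and the experts, would call for a different variance term.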

artificial intelligence, machine learning, natural language

Country:
- North America
  - United States (0.28)
  - Canada > British Columbia (0.04)

Genre:
- Research Report > Experimental Study (1.00)

Industry:
- Information Technology > Security & Privacy (0.93)
- Health & Medicine
  - Therapeutic Area > Oncology (1.00)
  - Health Care Providers & Services (1.00)
  - Public Health (0.88)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (0.68)
