Detecting Bias and Enhancing Diagnostic Accuracy in Large Language Models for Healthcare
Pardis Sadat Zahraei, Zahra Shakeri
Biased AI-generated medical advice and misdiagnoses can jeopardize patient safety, making the integrity of AI in healthcare more critical than ever. As Large Language Models (LLMs) take on a growing role in medical decision-making, addressing their biases and enhancing their accuracy is key to delivering safe, reliable care. This study addresses these challenges by introducing new resources designed to promote ethical and precise AI in healthcare. We present two datasets: BiasMD, featuring 6,007 question-answer pairs crafted to evaluate and mitigate biases in health-related LLM outputs, and DiseaseMatcher, with 32,000 clinical question-answer pairs spanning 700 diseases, aimed at assessing symptom-based diagnostic accuracy. Using these datasets, we developed EthiClinician, a fine-tuned model built on the ChatDoctor framework that outperforms GPT-4 in both ethical reasoning and clinical judgment. By exposing and correcting hidden biases in existing healthcare models, our work sets a new benchmark for safer, more reliable patient outcomes.
arXiv.org Artificial Intelligence
Oct-9-2024
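
The abstract describes evaluating symptom-based diagnostic accuracy over clinical question-answer pairs. The actual BiasMD and DiseaseMatcher schemas and scoring protocol are not given here, so the following is only a minimal Python sketch of how such an evaluation could look: the JSONL layout, the `question`/`answer` field names, the exact-match metric, and the `toy_model` stub are all assumptions for illustration, not the paper's method.

```python
import json
from typing import Callable, Iterable

# Hypothetical record shape for one clinical QA pair; every field name
# here is an assumption, since the abstract does not publish a schema:
#   {"question": "<symptom description>", "answer": "<reference diagnosis>"}

def load_qa_pairs(path: str) -> list[dict]:
    """Load QA pairs from a JSONL file (assumed distribution format)."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]

def diagnostic_accuracy(pairs: Iterable[dict],
                        model: Callable[[str], str]) -> float:
    """Score a model by exact match against the reference answer.

    Exact string match is a stand-in metric; the paper's actual
    scoring protocol is not described in the abstract.
    """
    pairs = list(pairs)
    if not pairs:
        return 0.0
    hits = sum(
        model(p["question"]).strip().lower() == p["answer"].strip().lower()
        for p in pairs
    )
    return hits / len(pairs)

if __name__ == "__main__":
    # Toy stand-in for a fine-tuned LLM such as EthiClinician.
    def toy_model(question: str) -> str:
        return "influenza"

    sample = [
        {"question": "Fever, dry cough, and body aches for two days?",
         "answer": "influenza"},
        {"question": "Burning chest pain that worsens when lying down?",
         "answer": "gastroesophageal reflux disease"},
    ]
    print(f"accuracy = {diagnostic_accuracy(sample, toy_model):.2f}")
```

In practice, free-text diagnoses would call for fuzzier scoring (for example, normalizing model outputs to a disease ontology before comparison); the exact-match check above deliberately ignores that complexity.
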
- Country:
- Europe > Middle East
- Malta (0.14)
- North America
- Canada > Ontario
- Toronto (0.14)
- Mexico > Mexico City (0.14)
- Genre:
- Research Report > New Finding (0.46)
- Industry:
- Health & Medicine
- Consumer Health (1.00)
- Diagnostic Medicine (1.00)
- Therapeutic Area
- Gastroenterology (1.00)
- Immunology (1.00)
- Infections and Infectious Diseases (1.00)
- Musculoskeletal (1.00)
- Neurology (1.00)
- Psychiatry/Psychology (0.68)
- Technology: