HealthContradict: Evaluating Biomedical Knowledge Conflicts in Language Models
Zhang, Boya, Bornet, Alban, Yang, Rui, Liu, Nan, Teodoro, Douglas
arXiv.org Artificial Intelligence
How do language models use contextual information to answer health questions? How are their responses impacted by conflicting contexts? We assess the ability of language models to reason over long, conflicting biomedical contexts using HealthContradict, an expert-verified dataset comprising 920 unique instances, each consisting of a health-related question, a factual answer supported by scientific evidence, and two documents presenting contradictory stances. We consider several prompt settings, including correct, incorrect, or contradictory context, and measure their impact on model outputs. Compared to existing medical question-answering evaluation benchmarks, HealthContradict more clearly distinguishes language models' contextual reasoning capabilities. Our experiments show that the strength of fine-tuned biomedical language models lies not only in their parametric knowledge from pretraining, but also in their ability to exploit correct context while resisting incorrect context.
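The instance structure and prompt settings described above can be sketched roughly as follows. This is an illustrative sketch only, not the authors' code: the `Instance` field names, the `build_prompts` helper, the prompt template, and the setting labels are all hypothetical, and the `no_context` baseline is an assumption that the abstract does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    """One dataset item as the abstract describes it (field names are hypothetical)."""
    question: str
    answer: str          # factual answer supported by scientific evidence
    supporting_doc: str  # document whose stance agrees with the answer
    opposing_doc: str    # document whose stance contradicts the answer


def build_prompts(item: Instance) -> dict[str, str]:
    """Return one prompt per context setting (labels and template are illustrative)."""

    def render(*docs: str) -> str:
        # Number the documents so a model can tell them apart.
        context = "\n\n".join(f"Document {i}: {d}" for i, d in enumerate(docs, 1))
        prefix = f"{context}\n\n" if docs else ""
        return f"{prefix}Question: {item.question}\nAnswer:"

    return {
        "no_context": render(),  # assumed baseline, not stated in the abstract
        "correct": render(item.supporting_doc),
        "incorrect": render(item.opposing_doc),
        "contradictory": render(item.supporting_doc, item.opposing_doc),
    }


# Placeholder strings only; these are not taken from the dataset.
example = Instance(
    question="<health question>",
    answer="<yes or no>",
    supporting_doc="<document agreeing with the answer>",
    opposing_doc="<document contradicting the answer>",
)
prompts = build_prompts(example)
```

Comparing a model's answers across these settings is one way to separate what it knows parametrically from how it uses, or resists, the supplied context.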
Dec-3-2025
- Country:
- Asia
- Indonesia > Bali (0.04)
- Middle East > UAE
- Abu Dhabi Emirate > Abu Dhabi (0.04)
- Singapore (0.04)
- Thailand > Bangkok
- Bangkok (0.04)
- Europe
- Belgium > Brussels-Capital Region
- Brussels (0.04)
- Middle East > Malta
- Eastern Region > Northern Harbour District > St. Julian's (0.04)
- Switzerland > Geneva
- Geneva (0.04)
- North America
- Canada > Ontario
- Toronto (0.04)
- Dominican Republic (0.04)
- Mexico > Mexico City
- Mexico City (0.04)
- United States
- Florida > Miami-Dade County
- Miami (0.04)
- Louisiana > Orleans Parish
- New Orleans (0.04)
- Genre:
- Research Report
- Experimental Study (0.94)
- New Finding (0.94)
- Industry:
- Education > Health & Safety
- School Nutrition (1.00)
- Health & Medicine
- Consumer Health (1.00)
- Pharmaceuticals & Biotechnology (1.00)
- Therapeutic Area
- Cardiology/Vascular Diseases (1.00)
- Endocrinology (0.68)
- Immunology (1.00)
- Infections and Infectious Diseases (1.00)
- Musculoskeletal (1.00)
- Neurology (1.00)
- Oncology (1.00)
- Psychiatry/Psychology (1.00)
- Technology: