Synth-SBDH: A Synthetic Dataset of Social and Behavioral Determinants of Health for Clinical Text
Mitra, Avijit, Druhl, Emily, Goodwin, Raelene, Yu, Hong
arXiv.org Artificial Intelligence
Social and behavioral determinants of health (SBDH) play a crucial role in health outcomes and are frequently documented in clinical text. Automatically extracting SBDH information from clinical text relies on publicly available, high-quality datasets. However, existing SBDH datasets are substantially limited in both availability and coverage. In this study, we introduce Synth-SBDH, a novel synthetic dataset with detailed SBDH annotations, encompassing status, temporal information, and rationale across 15 SBDH categories. We showcase the utility of Synth-SBDH on three tasks using real-world clinical datasets from two distinct hospital settings, highlighting its versatility, generalizability, and distillation capabilities. Models trained on Synth-SBDH consistently outperform counterparts trained without it, with improvements of up to 62.5% in macro-F. Additionally, Synth-SBDH proves effective for rare SBDH categories and under resource constraints. Human evaluation shows a human-LLM alignment of 71.06% and uncovers areas for future refinement.
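As a reading aid for the macro-F figure quoted above, the following is a minimal sketch of how a macro-averaged F score is commonly computed: per-category F1 is calculated first, then averaged without weighting, so rare categories count as much as frequent ones. This is an illustration under assumptions, not the evaluation code used in the paper; the function name `macro_f1` and the category labels are hypothetical, and the paper's tasks may score at a different granularity (for example, spans rather than single labels).

```python
from collections import Counter


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 over every class seen in gold or pred."""
    labels = sorted(set(gold) | set(pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1          # correct prediction for class g
        else:
            fp[p] += 1          # predicted p, but it was wrong
            fn[g] += 1          # missed an instance of class g
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2TP / (2TP + FP + FN); defined as 0.0 when the class has no counts
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical labels: one frequent category and one rare category.
gold = ["housing", "housing", "housing", "isolation"]
pred = ["housing", "housing", "housing", "housing"]
score = macro_f1(gold, pred)
```

In this toy case the model is right on three of four examples, yet the rare category is missed entirely and contributes an F1 of zero, which pulls the macro average well below the accuracy. That sensitivity is why macro-F is a natural metric when rare categories matter.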
Jun-10-2024
- Country:
  - Asia > Middle East
    - Iraq > Muthanna Governorate (0.04)
  - North America > United States
    - Massachusetts
      - Hampshire County > Amherst (0.04)
      - Middlesex County > Lowell (0.04)
- Genre:
  - Research Report > New Finding (0.87)
- Industry:
  - Government
  - Health & Medicine
    - Consumer Health (1.00)
    - Health Care Providers & Services (1.00)
    - Pharmaceuticals & Biotechnology (1.00)
    - Therapeutic Area
      - Neurology > Alzheimer's Disease (0.68)
      - Oncology (0.68)
      - Psychiatry/Psychology
        - Addiction Disorder (0.68)
        - Mental Health (0.67)
  - Law > Criminal Law (0.93)
  - Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.93)
- Technology: