Frame Semantic Patterns for Identifying Underreporting of Notifiable Events in Healthcare: The Case of Gender-Based Violence
Dutra, Lívia; Lorenzi, Arthur; Berno, Laís; Campos, Franciany; Biscardi, Karoline; Brown, Kenneth; Viridiano, Marcelo; Belcavello, Frederico; Matos, Ely; Guaranha, Olívia; Santos, Erik; Reinach, Sofia; Torrent, Tiago Timponi
arXiv.org Artificial Intelligence
We introduce a methodology for identifying notifiable events in the healthcare domain. The methodology uses semantic frames to define fine-grained patterns and search for them in unstructured data, namely open-text fields in e-medical records. We apply the methodology to the problem of underreporting of gender-based violence (GBV) in e-medical records produced during patients' visits to primary care units. Eight patterns are defined and searched for in a corpus of 21 million sentences in Brazilian Portuguese extracted from e-SUS APS. The results are manually evaluated by linguists, and the precision of each pattern is measured. Our findings reveal that the methodology effectively identifies reports of violence with a precision of 0.726, confirming its robustness. Designed as a transparent, efficient, low-carbon, and language-agnostic pipeline, the approach can be easily adapted to other health surveillance contexts, contributing to the broader, ethical, and explainable use of NLP in public health systems.
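The precision measurement mentioned in the abstract can be illustrated with a minimal sketch. It assumes that each sentence retrieved by a pattern receives a yes/no judgment from a human annotator, and that precision is the share of retrieved sentences judged to be true reports. The function names, the pattern identifiers, and the toy judgments below are hypothetical and are not taken from the paper or its data.

```python
from collections import defaultdict


def precision_by_pattern(judgments):
    """Return {pattern_id: precision} from (pattern_id, is_relevant) pairs.

    Each pair is one sentence retrieved by a pattern plus the annotator's
    verdict on whether it really reports the event of interest.
    """
    hits = defaultdict(int)    # retrieved sentences judged relevant
    totals = defaultdict(int)  # all retrieved sentences
    for pattern_id, is_relevant in judgments:
        totals[pattern_id] += 1
        if is_relevant:
            hits[pattern_id] += 1
    return {p: hits[p] / totals[p] for p in totals}


def overall_precision(judgments):
    """Return precision pooled over all patterns."""
    judgments = list(judgments)
    if not judgments:
        raise ValueError("no judgments to evaluate")
    relevant = sum(1 for _, is_relevant in judgments if is_relevant)
    return relevant / len(judgments)


# Hypothetical toy annotations, NOT the paper's data.
toy_judgments = [
    ("P1", True), ("P1", True), ("P1", False),
    ("P2", True), ("P2", False),
]

per_pattern = precision_by_pattern(toy_judgments)
pooled = overall_precision(toy_judgments)
```

Whether the reported figure of 0.726 is pooled over all retrieved sentences or averaged across patterns is not stated in the abstract; the sketch shows the pooled variant alongside the per-pattern values.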
Nov-3-2025
- Country:
  - South America > Brazil (0.69)
- Genre:
  - Research Report > New Finding (0.34)
- Industry:
  - Information Technology > Security & Privacy (1.00)
  - Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.85)
  - Law > Criminal Law (0.85)
  - Health & Medicine
    - Health Care Technology > Medical Record (0.70)
    - Public Health (0.69)
    - Health Care Providers & Services (0.68)
    - Therapeutic Area > Psychiatry/Psychology (0.46)
- Technology: