KEEP: Integrating Medical Ontologies with Clinical Data for Robust Code Embeddings
Elhussein, Ahmed, Meddeb, Paul, Newbury, Abigail, Mirone, Jeanne, Stoll, Martin, Gursoy, Gamze
arXiv.org Artificial Intelligence
Machine learning in healthcare requires effective representation of structured medical codes, but current methods face a trade-off: knowledge graph-based approaches capture formal relationships but miss real-world patterns, while data-driven methods learn empirical associations but often overlook the structured knowledge in medical terminologies. We present KEEP (Knowledge-preserving and Empirically refined Embedding Process), an efficient framework that bridges this gap by combining knowledge graph embeddings with adaptive learning from clinical data. KEEP first generates embeddings from knowledge graphs, then employs regularized training on patient records to adaptively integrate empirical patterns while preserving ontological relationships. Importantly, KEEP produces final embeddings without task-specific auxiliary or end-to-end training, enabling it to support multiple downstream applications and model architectures. Evaluations on structured EHR data from UK Biobank and MIMIC-IV demonstrate that KEEP outperforms both traditional and Language Model-based approaches in capturing semantic relationships and predicting clinical outcomes. Moreover, KEEP's minimal computational requirements make it particularly suitable for resource-constrained environments.

Data and Code Availability: This research has been conducted using data from UK Biobank (Sudlow et al., 2015) and MIMIC-IV (Johnson et al., 2021). Researchers can request access via https://www.ukbiobank.ac.uk/ and https://physionet.
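The two-stage idea in the abstract (start from knowledge graph embeddings, then refine them on patient data under a regularizer that keeps them near their starting point) can be illustrated with a small sketch. This is not the authors' implementation: the squared-error co-occurrence loss, the L2 penalty toward the initial embeddings, the function names `objective` and `refine_embeddings`, and all hyperparameters are assumptions chosen for illustration, and the "knowledge graph" embeddings here are random stand-ins.

```python
import numpy as np


def objective(emb, init_emb, cooc, lam):
    """Illustrative loss: fit log co-occurrence, stay close to init_emb."""
    mask = (cooc > 0).astype(float)
    np.fill_diagonal(mask, 0.0)  # ignore self co-occurrence
    resid = mask * (emb @ emb.T - np.log1p(cooc))
    return float(np.sum(resid ** 2) + lam * np.sum((emb - init_emb) ** 2))


def refine_embeddings(init_emb, cooc, lam=0.1, lr=0.005, epochs=300):
    """Gradient descent on `objective`, starting from init_emb.

    init_emb: (n_codes, dim) array, e.g. produced by a graph embedding
              method run on an ontology (assumed, not shown here).
    cooc:     (n_codes, n_codes) symmetric code co-occurrence counts
              taken from patient records (assumed).
    lam:      strength of the pull back toward init_emb.
    """
    emb = init_emb.copy()
    target = np.log1p(cooc)
    mask = (cooc > 0).astype(float)
    np.fill_diagonal(mask, 0.0)
    for _ in range(epochs):
        err = mask * (emb @ emb.T - target)
        # Gradient of the data term plus gradient of the L2 penalty.
        grad = 2.0 * (err + err.T) @ emb + 2.0 * lam * (emb - init_emb)
        emb -= lr * grad
    return emb


# Toy example: 4 hypothetical codes with 3-dimensional embeddings.
rng = np.random.default_rng(0)
init = 0.1 * rng.standard_normal((4, 3))
cooc = np.array([[0, 5, 1, 0],
                 [5, 0, 0, 2],
                 [1, 0, 0, 3],
                 [0, 2, 3, 0]], dtype=float)

refined = refine_embeddings(init, cooc)
print("loss before:", objective(init, init, cooc, 0.1))
print("loss after: ", objective(refined, init, cooc, 0.1))
```

Raising `lam` keeps the refined vectors closer to the ontology-derived starting point, while lowering it lets the co-occurrence data dominate. That is the trade-off the abstract describes, in a deliberately simplified form.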
Oct-7-2025
- Country:
- North America > United States
- California > Santa Clara County
- Palo Alto (0.04)
- Minnesota > Hennepin County
- Minneapolis (0.14)
- New York (0.04)
- Genre:
- Research Report > Experimental Study (1.00)
- Industry:
- Health & Medicine
- Health Care Providers & Services (1.00)
- Health Care Technology > Medical Record (0.67)
- Therapeutic Area
- Cardiology/Vascular Diseases (0.46)
- Endocrinology > Diabetes (0.48)
- Nephrology (0.68)
- Technology: