comorbidity
- Oceania > Australia (0.04)
- South America > Colombia (0.04)
- Oceania > New Zealand (0.04)
- (3 more...)
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.94)
- Health & Medicine > Therapeutic Area > Dermatology (1.00)
- Health & Medicine > Therapeutic Area > Oncology (0.68)
Are LLMs Truly Multilingual? Exploring Zero-Shot Multilingual Capability of LLMs for Information Retrieval: An Italian Healthcare Use Case
Kembu, Vignesh Kumar, Morandini, Pierandrea, Ranzini, Marta Bianca Maria, Nocera, Antonino
Large Language Models (LLMs) have become a key topic in AI and NLP, transforming sectors like healthcare, finance, education, and marketing by improving customer service, automating tasks, providing insights, improving diagnostics, and personalizing learning experiences. Information extraction from clinical records is a crucial task in digital healthcare. Although traditional NLP techniques have been used for this in the past, they often fall short due to the complexity, variability of clinical language, and high inner semantics in the free clinical text. Recently, Large Language Models (LLMs) have become a powerful tool for better understanding and generating human-like text, making them highly effective in this area. In this paper, we explore the ability of open-source multilingual LLMs to understand EHRs (Electronic Health Records) in Italian and help extract information from them in real-time. Our detailed experimental campaign on comorbidity extraction from EHR reveals that some LLMs struggle in zero-shot, on-premises settings, and others show significant variation in performance, struggling to generalize across various diseases when compared to native pattern matching and manual annotations.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Europe > Italy > Lombardy > Milan (0.04)
- Health & Medicine > Health Care Technology > Medical Record (1.00)
- Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (0.31)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.48)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
CODE-II: A large-scale dataset for artificial intelligence in ECG analysis
Abreu, Petrus E. O. G. B., Paixão, Gabriela M. M., Li, Jiawei, Gomes, Paulo R., Macfarlane, Peter W., Oliveira, Ana C. S., Carvalho, Vinicius T., Schön, Thomas B., Ribeiro, Antonio Luiz P., Ribeiro, Antônio H.
Data-driven methods for electrocardiogram (ECG) interpretation are rapidly progressing. Large datasets have enabled advances in artificial intelligence (AI) based ECG analysis, yet limitations in annotation quality, size, and scope remain major challenges. Here we present CODE-II, a large-scale real-world dataset of 2,735,269 12-lead ECGs from 2,093,807 adult patients collected by the Telehealth Network of Minas Gerais (TNMG), Brazil. Each exam was annotated using standardized diagnostic criteria and reviewed by cardiologists. A defining feature of CODE-II is a set of 66 clinically meaningful diagnostic classes, developed with cardiologist input and routinely used in telehealth practice. We additionally provide an open available subset: CODE-II-open, a public subset of 15,000 patients, and the CODE-II-test, a non-overlapping set of 8,475 exams reviewed by multiple cardiologists for blinded evaluation. A neural network pre-trained on CODE-II achieved superior transfer performance on external benchmarks (PTB-XL and CPSC 2018) and outperformed alternatives trained on larger datasets.
- South America > Brazil > Minas Gerais (0.24)
- Europe > Germany (0.04)
- Asia > China > Zhejiang Province > Ningbo (0.04)
- (8 more...)
- Research Report > Experimental Study (1.00)
- Overview (1.00)
- Research Report > New Finding (0.93)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
Reinforcement Learning for Out-of-Distribution Reasoning in LLMs: An Empirical Study on Diagnosis-Related Group Coding
Wang, Hanyin, Wu, Zhenbang, Kolar, Gururaj, Korsapati, Hariprasad, Bartlett, Brian, Hull, Bryan, Sun, Jimeng
Diagnosis-Related Group (DRG) codes are essential for hospital reimbursement and operations but require labor-intensive assignment. Large Language Models (LLMs) struggle with DRG coding due to the out-of-distribution (OOD) nature of the task: pretraining corpora rarely contain private clinical or billing data. We introduce DRG-Sapphire, which uses large-scale reinforcement learning (RL) for automated DRG coding from clinical notes. Built on Qwen2.5-7B and trained with Group Relative Policy Optimization (GRPO) using rule-based rewards, DRG-Sapphire introduces a series of RL enhancements to address domain-specific challenges not seen in previous mathematical tasks. Our model achieves state-of-the-art accuracy on the MIMIC-IV benchmark and generates physician-validated reasoning for DRG assignments, significantly enhancing explainability. Our study further sheds light on broader challenges of applying RL to knowledge-intensive, OOD tasks. We observe that RL performance scales approximately linearly with the logarithm of the number of supervised fine-tuning (SFT) examples, suggesting that RL effectiveness is fundamentally constrained by the domain knowledge encoded in the base model. For OOD tasks like DRG coding, strong RL performance requires sufficient knowledge infusion prior to RL. Consequently, scaling SFT may be more effective and computationally efficient than scaling RL alone for such tasks.
Clinical characteristics, complications and outcomes of critically ill patients with Dengue in Brazil, 2012-2024: a nationwide, multicentre cohort study
Peres, Igor Tona, Ranzani, Otavio T., Bastos, Leonardo S. L., Hamacher, Silvio, Edinburgh, Tom, Garcia-Gallo, Esteban, Bozza, Fernando Augusto
Background. Dengue outbreaks are a major public health issue, with Brazil reporting 71% of global cases in 2024. Purpose. This study aims to describe the profile of severe dengue patients admitted to Brazilian Intensive Care units (ICUs) (2012-2024), assess trends over time, describe new onset complications while in ICU and determine the risk factors at admission to develop complications during ICU stay. Methods. We performed a prospective study of dengue patients from 253 ICUs across 56 hospitals. We used descriptive statistics to describe the dengue ICU population, logistic regression to identify risk factors for complications during the ICU stay, and a machine learning framework to predict the risk of evolving to complications. Visualisations were generated using ISARIC VERTEX. Results. Of 11,047 admissions, 1,117 admissions (10.1%) evolved to complications, including non-invasive (437 admissions) and invasive ventilation (166), vasopressor (364), blood transfusion (353) and renal replacement therapy (103). Age>80 (OR: 3.10, 95% CI: 2.02-4.92), chronic kidney disease (OR: 2.94, 2.22-3.89), liver cirrhosis (OR: 3.65, 1.82-7.04), low platelets (<50,000 cells/mm3; OR: OR: 2.25, 1.89-2.68), and high leukocytes (>7,000 cells/mm3; OR: 2.47, 2.02-3.03) were significant risk factors for complications. A machine learning tool for predicting complications was proposed, showing accurate discrimination and calibration. Conclusion. We described a large cohort of dengue patients admitted to ICUs and identified key risk factors for severe dengue complications, such as advanced age, presence of comorbidities, higher level of leukocytes and lower level of platelets. The proposed prediction tool can be used for early identification and targeted interventions to improve outcomes in dengue-endemic regions.
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- South America > Brazil > Rio de Janeiro > Rio de Janeiro (0.05)
- South America > Peru (0.04)
- (10 more...)
- Research Report > Strength Medium (1.00)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Oceania > Australia (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- South America > Colombia (0.04)
- (4 more...)
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.94)
- Health & Medicine > Therapeutic Area > Dermatology (1.00)
- Health & Medicine > Therapeutic Area > Oncology (0.68)
Discovery of Disease Relationships via Transcriptomic Signature Analysis Powered by Agentic AI
Modern disease classification often overlooks molecular commonalities hidden beneath divergent clinical presentations. This study introduces a transcriptomics-driven framework for discovering disease relationships by analyzing over 1300 disease-condition pairs using GenoMAS, a fully automated agentic AI system. Beyond identifying robust gene-level overlaps, we develop a novel pathway-based similarity framework that integrates multi-database enrichment analysis to quantify functional convergence across diseases. The resulting disease similarity network reveals both known comorbidities and previously undocumented cross-category links. By examining shared biological pathways, we explore potential molecular mechanisms underlying these connections-offering functional hypotheses that go beyond symptom-based taxonomies. We further show how background conditions such as obesity and hypertension modulate transcriptomic similarity, and identify therapeutic repurposing opportunities for rare diseases like autism spectrum disorder based on their molecular proximity to better-characterized conditions. In addition, this work demonstrates how biologically grounded agentic AI can scale transcriptomic analysis while enabling mechanistic interpretation across complex disease landscapes. All results are publicly accessible at github.com/KeeeeChen/Pathway_Similarity_Network.
- North America > United States > Illinois > Champaign County > Urbana (0.14)
- South America > Brazil (0.04)
Medical Data Pecking: A Context-Aware Approach for Automated Quality Evaluation of Structured Medical Data
Girshovitz, Irena, Ambus, Atai, Shahar, Moni, Gilad-Bachrach, Ran
Background: The use of Electronic Health Records (EHRs) for epidemiological studies and artificial intelligence (AI) training is increasing rapidly. The reliability of the results depends on the accuracy and completeness of EHR data. However, EHR data often contain significant quality issues, including misrepresentations of subpopulations, biases, and systematic errors, as they are primarily collected for clinical and billing purposes. Existing quality assessment methods remain insufficient, lacking systematic procedures to assess data fitness for research. Methods: We present the Medical Data Pecking approach, which adapts unit testing and coverage concepts from software engineering to identify data quality concerns. We demonstrate our approach using the Medical Data Pecking Tool (MDPT), which consists of two main components: (1) an automated test generator that uses large language models and grounding techniques to create a test suite from data and study descriptions, and (2) a data testing framework that executes these tests, reporting potential errors and coverage. Results: We evaluated MDPT on three datasets: All of Us (AoU), MIMIC-III, and SyntheticMass, generating 55-73 tests per cohort across four conditions. These tests correctly identified 20-43 non-aligned or non-conforming data issues. We present a detailed analysis of the LLM-generated test suites in terms of reference grounding and value accuracy. Conclusion: Our approach incorporates external medical knowledge to enable context-sensitive data quality testing as part of the data analysis workflow to improve the validity of its outcomes. Our approach tackles these challenges from a quality assurance perspective, laying the foundation for further development such as additional data modalities and improved grounding methods.
- Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)
- Europe (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > United States > Alaska (0.04)
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.93)
Clinically Interpretable Mortality Prediction for ICU Patients with Diabetes and Atrial Fibrillation: A Machine Learning Approach
Sun, Li, Chen, Shuheng, Si, Yong, Fan, Junyi, Pishgar, Maryam, Pishgar, Elham, Alaei, Kamiar, Placencia, Greg
Background: Patients with both diabetes mellitus (DM) and atrial fibrillation (AF) face elevated mortality in intensive care units (ICUs), yet models targeting this high-risk group remain limited. Objective: To develop an interpretable machine learning (ML) model predicting 28-day mortality in ICU patients with concurrent DM and AF using early-phase clinical data. Methods: A retrospective cohort of 1,535 adult ICU patients with DM and AF was extracted from the MIMIC-IV database. Data preprocessing involved median/mode imputation, z-score normalization, and early temporal feature engineering. A two-step feature selection pipeline-univariate filtering (ANOVA F-test) and Random Forest-based multivariate ranking-yielded 19 interpretable features. Seven ML models were trained with stratified 5-fold cross-validation and SMOTE oversampling. Interpretability was assessed via ablation and Accumulated Local Effects (ALE) analysis. Results: Logistic regression achieved the best performance (AUROC: 0.825; 95% CI: 0.779-0.867), surpassing more complex models. Key predictors included RAS, age, bilirubin, and extubation. ALE plots showed intuitive, non-linear effects such as age-related risk acceleration and bilirubin thresholds. Conclusion: This interpretable ML model offers accurate risk prediction and clinical insights for early ICU triage in patients with DM and AF.
- North America > United States > California > Los Angeles County > Los Angeles (0.28)
- North America > United States > Massachusetts (0.04)
- Asia > Middle East > Israel (0.04)
- (6 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
Prediction of Delirium Risk in Mild Cognitive Impairment Using Time-Series data, Machine Learning and Comorbidity Patterns -- A Retrospective Study
Ramamoorthy, Santhakumar, Rani, Priya, Mahon, James, Mathews, Glenn, Cloherty, Shaun, Babaei, Mahdi
Delirium represents a significant clinical concern characterized by high morbidity and mortality rates, particularly in patients with mild cognitive impairment (MCI). This study investigates the associated risk factors for delirium by analyzing the comorbidity patterns relevant to MCI and developing a longitudinal predictive model leveraging machine learning methodologies. A retrospective analysis utilizing the MIMIC-IV v2.2 database was performed to evaluate comorbid conditions, survival probabilities, and predictive modeling outcomes. The examination of comorbidity patterns identified distinct risk profiles for the MCI population. Kaplan-Meier survival analysis demonstrated that individuals with MCI exhibit markedly reduced survival probabilities when developing delirium compared to their non-MCI counterparts, underscoring the heightened vulnerability within this cohort. For predictive modeling, a Long Short-Term Memory (LSTM) ML network was implemented utilizing time-series data, demographic variables, Charlson Comorbidity Index (CCI) scores, and an array of comorbid conditions. The model demonstrated robust predictive capabilities with an AUROC of 0.93 and an AUPRC of 0.92. This study underscores the critical role of comorbidities in evaluating delirium risk and highlights the efficacy of time-series predictive modeling in pinpointing patients at elevated risk for delirium development.
- Asia > Bangladesh (0.04)
- North America > United States > Massachusetts (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
- Health & Medicine > Therapeutic Area > Neurology (1.00)