Multimodal MRI-Ultrasound AI for Prostate Cancer Detection Outperforms Radiologist MRI Interpretation: A Multi-Center Study

Jahanandish, Hassan, Sang, Shengtian, Li, Cynthia Xinran, Vesal, Sulaiman, Bhattacharya, Indrani, Lee, Jeong Hoon, Fan, Richard, Sonn, Geoffrey A., Rusu, Mirabela

arXiv.org Artificial Intelligence

Pre-biopsy magnetic resonance imaging (MRI) is increasingly used to target suspicious prostate lesions. This has led to artificial intelligence (AI) applications that improve MRI-based detection of clinically significant prostate cancer (CsPCa). However, MRI-detected lesions must still be mapped to transrectal ultrasound (TRUS) images during biopsy, which results in missed CsPCa. This study systematically evaluates a multimodal AI framework integrating MRI and TRUS image sequences to enhance CsPCa identification. The study included 3110 patients from three cohorts across two institutions who underwent prostate biopsy. The proposed framework, based on the 3D UNet architecture, was evaluated on 1700 test cases, comparing performance to unimodal AI models that use either MRI or TRUS alone. Additionally, the proposed model was compared to radiologists in a cohort of 110 patients. The multimodal AI approach achieved superior sensitivity (80%) and Lesion Dice (42%) compared to unimodal MRI (73%, 30%) and TRUS models (49%, 27%). Compared to radiologists, the multimodal model showed higher specificity (88% vs. 78%) and Lesion Dice (38% vs. 33%), with equivalent sensitivity (79%). Our findings demonstrate the potential of multimodal AI to improve CsPCa lesion targeting during biopsy and treatment planning, surpassing current unimodal models and radiologists and ultimately improving outcomes for prostate cancer patients.
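
The abstract reports Lesion Dice without defining it. As a rough illustration only, the sketch below computes the standard Dice overlap, 2|A ∩ B| / (|A| + |B|), between a predicted and an annotated lesion. It assumes each lesion is represented as a set of pixel coordinates; the function name `dice` and the toy masks are hypothetical, and the paper's per-lesion variant may differ.

```python
def dice(pred, truth):
    """Dice overlap between two lesions given as sets of pixel coordinates."""
    pred, truth = set(pred), set(truth)
    if not pred and not truth:
        return 1.0  # convention (an assumption here): two empty masks agree
    return 2 * len(pred & truth) / (len(pred) + len(truth))

# Toy 2D masks: a 2x2 predicted lesion and a 3x2 annotated lesion
predicted = {(0, 0), (0, 1), (1, 0), (1, 1)}
annotated = {(0, 1), (0, 2), (1, 1), (1, 2), (2, 1), (2, 2)}

# The masks share 2 pixels, so Dice = 2 * 2 / (4 + 6) = 0.4
score = dice(predicted, annotated)
print(f"Dice = {score:.2f}")
```

Because Dice penalizes both missed and extra pixels, a model can detect most lesions (high sensitivity) and still score well below 100% on overlap.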


Institutional-Level Monitoring of Immune Checkpoint Inhibitor IrAEs Using a Novel Natural Language Processing Algorithmic Pipeline

Shapiro, Michael, Dor, Herut, Gurevich-Shapiro, Anna, Etan, Tal, Wolf, Ido

arXiv.org Artificial Intelligence

Background: Immune checkpoint inhibitors (ICIs) have revolutionized cancer treatment but can result in severe immune-related adverse events (IrAEs). Monitoring IrAEs on a large scale is essential for personalized risk profiling and assisting in treatment decisions. Methods: In this study, we conducted an analysis of clinical notes from patients who received ICIs at the Tel Aviv Sourasky Medical Center. By employing a Natural Language Processing algorithmic pipeline, we systematically identified seven common or severe IrAEs. We examined the utilization of corticosteroids, treatment discontinuation rates following IrAEs, and constructed survival curves to visualize the occurrence of adverse events during treatment. Results: Our analysis encompassed 108,280 clinical notes associated with 1,635 patients who had undergone ICI therapy. The detected incidence of IrAEs was consistent with previous reports, exhibiting substantial variation across different ICIs. Treatment with corticosteroids varied depending on the specific IrAE, ranging from 17.3% for thyroiditis to 57.4% for myocarditis. Our algorithm demonstrated high accuracy in identifying IrAEs, as indicated by an area under the curve (AUC) of 0.89 for each suspected note and F1 scores of 0.87 or higher for five out of the seven IrAEs examined at the patient level. Conclusions: This study presents a novel, large-scale monitoring approach utilizing deep neural networks for IrAEs. Our method provides accurate results, enhancing understanding of detrimental consequences experienced by ICI-treated patients. Moreover, it holds potential for monitoring other medications, enabling comprehensive post-marketing surveillance to identify susceptible populations and establish personalized drug safety profiles.


Acute kidney injury prediction for non-critical care patients: a retrospective external and internal validation study

Adiyeke, Esra, Ren, Yuanfang, Shickel, Benjamin, Ruppert, Matthew M., Guan, Ziyuan, Kane-Gill, Sandra L., Murugan, Raghavan, Amatullah, Nabihah, Stottlemyer, Britney A., Tran, Tiffany L., Ricketts, Dan, Horvat, Christopher M, Rashidi, Parisa, Bihorac, Azra, Ozrazgat-Baslanti, Tezcan

arXiv.org Artificial Intelligence

Background: Acute kidney injury (AKI), the decline of kidney excretory function, occurs in up to 18% of hospitalized admissions. Progression of AKI may lead to irreversible kidney damage. Methods: This retrospective cohort study includes adult patients admitted to a non-intensive care unit at the University of Pittsburgh Medical Center (UPMC) (n = 46,815) and University of Florida Health (UFH) (n = 127,202). We developed and compared deep learning and conventional machine learning models to predict progression to Stage 2 or higher AKI within the next 48 hours. We trained local models for each site (UFH Model trained on UFH, UPMC Model trained on UPMC) and a separate model with a development cohort of patients from both sites (UFH-UPMC Model). We internally and externally validated the models on each site and performed subgroup analyses across sex and race. Results: Stage 2 or higher AKI occurred in 3% (n = 3,257) and 8% (n = 2,296) of UFH and UPMC patients, respectively. Area under the receiver operating characteristic curve (AUROC) values for the UFH test cohort ranged between 0.77 (UPMC Model) and 0.81 (UFH Model), while AUROC values ranged between 0.79 (UFH Model) and 0.83 (UPMC Model) for the UPMC test cohort. The UFH-UPMC Model achieved an AUROC of 0.81 (95% confidence interval [CI] [0.80, 0.83]) for the UFH and 0.82 (95% CI [0.81, 0.84]) for the UPMC test cohorts, and an area under the precision-recall curve (AUPRC) of 0.06 (95% CI [0.05, 0.06]) for the UFH and 0.13 (95% CI [0.11, 0.15]) for the UPMC test cohorts. Kinetic estimated glomerular filtration rate, nephrotoxic drug burden and blood urea nitrogen remained the top three features with the highest influence across the models and health centers. Conclusion: Locally developed models displayed marginally reduced discrimination when tested on another institution, while the top set of influencing features remained the same across the models and sites.
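
To make the two reported metrics concrete, here is a minimal pure-Python sketch of AUROC (the probability that a randomly chosen positive case is scored above a randomly chosen negative one) and of average precision, one common estimator of AUPRC. The abstract does not say which AUPRC estimator the authors used, and the labels and scores below are invented toy data, not study data.

```python
def auroc(labels, scores):
    """Fraction of positive-negative pairs ranked correctly (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def average_precision(labels, scores):
    """Mean of the precision values at the rank of each positive case."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits

# Toy data (hypothetical): 2 events among 10 encounters
labels = [0, 0, 1, 0, 1, 0, 0, 0, 0, 0]
scores = [0.10, 0.40, 0.35, 0.20, 0.80, 0.30, 0.05, 0.15, 0.25, 0.12]

roc = auroc(labels, scores)
ap = average_precision(labels, scores)
prevalence = sum(labels) / len(labels)
print(f"AUROC={roc:.4f}  AP={ap:.4f}  prevalence={prevalence:.2f}")
```

A ranker with no skill has an AUROC near 0.5 but an AUPRC near the event prevalence, which is why AUPRC values for rare outcomes such as the 3% and 8% event rates above look small next to the corresponding AUROC values.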


A Comparative Analysis of Machine Learning Models for Early Detection of Hospital-Acquired Infections

Harvey, Ethan, Dong, Junzi, Ghosh, Erina, Samadani, Ali

arXiv.org Artificial Intelligence

As more and more infection-specific machine learning models are developed and planned for clinical deployment, simultaneously running predictions from different models may provide overlapping or even conflicting information. It is important to understand the concordance and behavior of parallel models in deployment. In this study, we focus on two models for the early detection of hospital-acquired infections (HAIs): 1) the Infection Risk Index (IRI) and 2) the Ventilator-Associated Pneumonia (VAP) prediction model. The IRI model was built to predict all HAIs, whereas the VAP model identifies patients at risk of developing ventilator-associated pneumonia. These models could substantially improve patient outcomes and hospital management of infections by detecting infections early and, in turn, enabling early interventions. The two models vary in terms of infection label definition, cohort selection, and prediction schema. In this work, we present a comparative analysis between the two models to characterize concordances and confusions in predicting HAIs by these models. The findings from this study will inform how to deploy multiple concurrent disease-specific models in the future.


The Potential of Wearable Sensors for Assessing Patient Acuity in Intensive Care Unit (ICU)

Sena, Jessica, Mostafiz, Mohammad Tahsin, Zhang, Jiaqing, Davidson, Andrea, Bandyopadhyay, Sabyasachi, Ren, Yuanfang, Ozrazgat-Baslanti, Tezcan, Shickel, Benjamin, Loftus, Tyler, Schwartz, William Robson, Bihorac, Azra, Rashidi, Parisa

arXiv.org Artificial Intelligence

Acuity assessments are vital in critical care settings to provide timely interventions and fair resource allocation. Traditional acuity scores rely on manual assessments and documentation of physiological states, which can be time-consuming, intermittent, and difficult for healthcare providers to use. Furthermore, such scores do not incorporate granular information such as patients' mobility level, which can indicate recovery or deterioration in the ICU. We hypothesized that existing acuity scores could be improved by employing Artificial Intelligence (AI) techniques in conjunction with Electronic Health Records (EHR) and wearable sensor data. In this study, we evaluated the impact of integrating mobility data collected from wrist-worn accelerometers with clinical data obtained from EHR for developing an AI-driven acuity assessment score. Accelerometry data were collected from 86 patients wearing accelerometers on their wrists in an academic hospital setting. The data were analyzed using five deep neural network models: VGG, ResNet, MobileNet, SqueezeNet, and a custom Transformer network. These models outperformed a rule-based clinical score, the Sequential Organ Failure Assessment (SOFA), used as a baseline, particularly regarding precision, sensitivity, and F1 score. The results showed that while a model relying solely on accelerometer data achieved limited performance (AUC 0.50, Precision 0.61, and F1-score 0.68), including demographic information with the accelerometer data led to a notable enhancement in performance (AUC 0.69, Precision 0.75, and F1-score 0.67). This work shows that the combination of mobility and patient information can successfully differentiate between stable and unstable states in critically ill patients.


Learning under random distributional shifts

Bansak, Kirk, Paulson, Elisabeth, Rothenhäusler, Dominik

arXiv.org Machine Learning

In various real-world settings, we might expect shifts to arise through the superposition of many small and random changes in the population and environment. Thus, we consider a class of random distribution shift models that capture arbitrary changes in the underlying covariate space, and dense, random shocks to the relationship between the covariates and the outcomes. In this setting, we characterize the benefits and drawbacks of several alternative prediction strategies: the standard approach that directly predicts the long-term outcomes of interest, the proxy approach that directly predicts a shorter-term proxy outcome, and a hybrid approach that utilizes both the long-term policy outcome and (shorter-term) proxy outcome(s). We show that the hybrid approach is robust to the strength of the distribution shift and the proxy relationship. We apply this method to datasets in two high-impact domains: asylum-seeker resettlement and early childhood education. In both settings, we find that the proposed approach results in substantially lower mean-squared error than current approaches.


Conformalized semi-supervised random forest for classification and abnormality detection

Han, Yujin, Xu, Mingwenchan, Guan, Leying

arXiv.org Artificial Intelligence

Traditional classifiers infer labels under the premise that the training and test samples are generated from the same distribution. This assumption can be problematic for safety-critical applications such as medical diagnosis and network attack detection. In this paper, we consider the multi-class classification problem when the training data and the test data may have different distributions. We propose conformalized semi-supervised random forest (CSForest), which constructs set-valued predictions $C(x)$ to include the correct class label with desired probability while detecting outliers efficiently. We compare the proposed method to other state-of-the-art methods in both a synthetic example and a real data application to demonstrate the strength of our proposal.
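
For readers unfamiliar with set-valued predictions, the sketch below shows plain split conformal prediction for classification. It is not CSForest: it is the standard baseline, whose coverage guarantee relies on calibration and test points being exchangeable, the very assumption the paper relaxes. The function names, class probabilities, and labels are hypothetical toy values.

```python
import math

def conformal_threshold(cal_probs, cal_labels, alpha):
    """Threshold on the score 1 - p(true class) from a held-out calibration set."""
    scores = sorted(1.0 - p[y] for p, y in zip(cal_probs, cal_labels))
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))  # finite-sample corrected rank
    if k > n:
        return float("inf")  # too few calibration points: keep every class
    return scores[k - 1]

def prediction_set(probs, q):
    """C(x): every class whose nonconformity score is within the threshold."""
    return {c for c, p in enumerate(probs) if 1.0 - p <= q}

# Hypothetical calibration data: predicted class probabilities and true labels
cal_probs = [
    [0.90, 0.05, 0.05], [0.20, 0.70, 0.10], [0.10, 0.10, 0.80],
    [0.60, 0.30, 0.10], [0.30, 0.50, 0.20], [0.25, 0.25, 0.50],
    [0.40, 0.35, 0.25], [0.85, 0.10, 0.05], [0.10, 0.30, 0.60],
]
cal_labels = [0, 1, 2, 0, 1, 2, 1, 0, 2]

q = conformal_threshold(cal_probs, cal_labels, alpha=0.2)
confident = prediction_set([0.50, 0.45, 0.05], q)  # one class passes
unusual = prediction_set([0.34, 0.33, 0.33], q)    # no class passes
print(q, confident, unusual)
```

Under exchangeability, $C(x)$ built this way contains the true label with probability at least $1 - \alpha$. An empty set means no known class fits the input well, which is one simple way a set-valued predictor can flag a possible outlier.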


Development of Machine Learning Models to Predict Probabilities and Types of Stroke at Prehospital Stage: the Japan Urgent Stroke Triage Score Using Machine Learning (JUST-ML)

#artificialintelligence

With recent advances in machine learning (ML), such technologies have been applied in various fields owing to their high predictive performance. We aimed to develop a prehospital stroke scale with ML. We conducted a multi-center retrospective and prospective cohort study. The training cohort comprised eight centers in Japan from June 2015 to March 2018, and the test cohort comprised 13 centers from April 2019 to March 2020. We used three different ML algorithms (logistic regression, random forests, XGBoost) to develop the models.