stroke prediction
Optimizing Stroke Risk Prediction: A Machine Learning Pipeline Combining ROS-Balanced Ensembles and XAI
Akib, A S M Ahsanul Sarkar, Khawla, Raduana, Hasib, Abdul
Stroke is a leading cause of death and permanent impairment, making it a major worldwide health concern. For prompt intervention and successful preventative tactics, early risk assessment is essential. To address this challenge, we used ensemble modeling and explainable AI (XAI) techniques to create an interpretable machine learning framework for stroke risk prediction. Our strategy comprised data preprocessing (using Random Over-Sampling (ROS) to address class imbalance), feature engineering, and a thorough evaluation of 10 different machine learning models using 5-fold cross-validation across several datasets. Our optimized ensemble model (Random Forest + ExtraTrees + XGBoost) performed exceptionally well, obtaining 99.09% accuracy on the Stroke Prediction Dataset (SPD). We improved the model's transparency and clinical applicability by identifying three important clinical variables using LIME-based interpretability analysis: age, hypertension, and glucose levels. This study highlights how combining ensemble learning with XAI can deliver highly accurate and interpretable early stroke risk assessment. By enabling data-driven prevention and personalized clinical decisions, our framework has the potential to transform stroke prediction and cardiovascular risk management.
- Asia > Singapore (0.04)
- Asia > Bangladesh > Dhaka Division > Dhaka District > Dhaka (0.04)
- Europe > Switzerland (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
- Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.87)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.67)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
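The Random Over-Sampling (ROS) step named in the abstract above can be illustrated with a minimal sketch. It is not the authors' code: the function name `random_oversample`, the toy rows, and the seed are hypothetical. It assumes resampling is applied to a training split only, since duplicating minority rows before a cross-validation split would place copies of the same record in both training and validation folds.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of each smaller class until
    every class has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # Candidate rows for this class, drawn with replacement.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Hypothetical training split: (age, hypertension, glucose) with one stroke case.
train_rows = [(67, 0, 228.7), (61, 0, 202.2), (80, 1, 105.9),
              (49, 0, 171.2), (79, 1, 174.1), (81, 0, 186.2)]
train_labels = [0, 0, 0, 0, 0, 1]

balanced_rows, balanced_labels = random_oversample(train_rows, train_labels)
print(Counter(balanced_labels))
```

Because ROS only repeats existing minority rows, it adds no new information; it changes how much weight the minority class receives during training.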
Advancing Tabular Stroke Modelling Through a Novel Hybrid Architecture and Feature-Selection Synergy
Islam, Yousuf, Chowdhury, Md. Jalal Uddin, Das, Sumon Chandra
Brain stroke remains one of the principal causes of death and disability worldwide, yet most tabular-data prediction models still hover below the 95% accuracy threshold, limiting real-world utility. Addressing this gap, the present work develops and validates a completely data-driven and interpretable machine-learning framework designed to predict strokes using ten routinely gathered demographic, lifestyle, and clinical variables sourced from a public cohort of 4,981 records. We employ a detailed exploratory data analysis (EDA) to understand the dataset's structure and distribution, followed by rigorous data preprocessing, including handling missing values, outlier removal, and class imbalance correction using Synthetic Minority Over-sampling Technique (SMOTE). To streamline feature selection, point-biserial correlation and random-forest Gini importance were utilized, and ten varied algorithms (encompassing tree ensembles, boosting, kernel methods, and a multilayer neural network) were optimized using stratified five-fold cross-validation. Their predicted probabilities were used to build the proposed model, which combines Random Forest, XGBoost, LightGBM, and a support-vector classifier, with logistic regression acting as a meta-learner. The proposed model achieved an accuracy of 97.2% and an F1-score of 97.15%, a significant improvement over the leading individual model, LightGBM, which had an accuracy of 91.4%. Our study's findings indicate that rigorous preprocessing, coupled with a diverse hybrid model, can convert low-cost tabular data into a nearly clinical-grade stroke-risk assessment tool.
- Asia > India > NCT > New Delhi (0.04)
- Asia > India > NCT > Delhi (0.04)
- Asia > Bangladesh (0.04)
- North America > United States (0.04)
- Health & Medicine > Therapeutic Area > Neurology (1.00)
- Health & Medicine > Therapeutic Area > Hematology (1.00)
- Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
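The point-biserial correlation mentioned in the abstract above as a feature-selection score measures how strongly a continuous feature separates a binary label. A minimal sketch follows, assuming the standard definition r_pb = (M1 - M0) / s * sqrt(p * (1 - p)), where M1 and M0 are the feature means in the two classes, s is the population standard deviation of the feature, and p is the share of positive labels. The function name and toy values are hypothetical, not taken from the paper.

```python
import math


def point_biserial(values, labels):
    """Point-biserial correlation between a continuous feature and a 0/1 label.

    Equivalent to the Pearson correlation when one variable is binary.
    Assumes both classes are present and the feature is not constant.
    """
    ones = [v for v, y in zip(values, labels) if y == 1]
    zeros = [v for v, y in zip(values, labels) if y == 0]
    n = len(values)
    mean_all = sum(values) / n
    # Population standard deviation (divide by n, not n - 1).
    std_pop = math.sqrt(sum((v - mean_all) ** 2 for v in values) / n)
    p = len(ones) / n
    mean_gap = sum(ones) / len(ones) - sum(zeros) / len(zeros)
    return mean_gap / std_pop * math.sqrt(p * (1 - p))


# Hypothetical feature values and stroke labels.
score = point_biserial([1.0, 2.0, 3.0, 4.0], [0, 0, 1, 1])
print(round(score, 4))
```

Features whose score is close to zero carry little linear signal about the label and are candidates for removal, although a threshold for "close to zero" is a modelling choice the abstract does not state.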
Stroke Prediction using Clinical and Social Features in Machine Learning
Every year in the United States, 800,000 individuals suffer a stroke: one person every 40 seconds, with a death occurring every four minutes. While individual factors vary, certain predictors are more prevalent in determining stroke risk. As strokes are the second leading cause of death and disability worldwide, predicting stroke likelihood based on lifestyle factors is crucial. Showing individuals their stroke risk could motivate lifestyle changes, and machine learning offers solutions to this prediction challenge. Neural networks excel at predicting outcomes based on training features like lifestyle factors; however, they are not the only option. Logistic regression models can also effectively compute the likelihood of binary outcomes based on independent variables, making them well-suited for stroke prediction. This analysis will compare both neural networks (dense and convolutional) and logistic regression models for stroke prediction, examining their pros, cons, and differences to develop the most effective predictor that minimizes false negatives.
- North America > United States > Virginia > Montgomery County > Blacksburg (0.04)
- Europe > United Kingdom > England (0.04)
- Research Report > New Finding (0.75)
- Research Report > Experimental Study (0.75)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
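The goal of minimizing false negatives stated in the abstract above is often approached by lowering the decision threshold applied to a model's predicted probabilities. The sketch below is one such illustration under stated assumptions: the probabilities, labels, and both thresholds are hypothetical, and nothing here is taken from the analysis itself.

```python
def confusion_counts(y_true, probs, threshold):
    """Count true/false positives and negatives at a given probability threshold."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for y, p in zip(y_true, probs):
        predicted = 1 if p >= threshold else 0
        if predicted == 1 and y == 1:
            counts["tp"] += 1
        elif predicted == 1 and y == 0:
            counts["fp"] += 1
        elif predicted == 0 and y == 0:
            counts["tn"] += 1
        else:
            counts["fn"] += 1
    return counts


def recall(counts):
    """Share of actual positives that were flagged (1 - false-negative rate)."""
    positives = counts["tp"] + counts["fn"]
    return counts["tp"] / positives if positives else 0.0


# Hypothetical labels (1 = stroke) and model probabilities.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
probs = [0.9, 0.45, 0.2, 0.6, 0.3, 0.1, 0.05, 0.4]

default_counts = confusion_counts(y_true, probs, threshold=0.5)
lowered_counts = confusion_counts(y_true, probs, threshold=0.25)
print(default_counts, recall(default_counts))
print(lowered_counts, recall(lowered_counts))
```

On this toy data the lower threshold misses fewer stroke cases but raises more false alarms, which is the trade-off any false-negative-focused predictor has to weigh.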
Predicting Stroke through Retinal Graphs and Multimodal Self-supervised Learning
Huang, Yuqing, Wittmann, Bastian, Demler, Olga, Menze, Bjoern, Davoudi, Neda
Early identification of stroke is crucial for intervention, requiring reliable models. We proposed an efficient retinal image representation together with clinical information to capture a comprehensive overview of cardiovascular health, leveraging large multimodal datasets for new medical insights. Our approach is one of the first contrastive frameworks that integrates graph and tabular data, using vessel graphs derived from retinal images for efficient representation. This method, combined with multimodal contrastive learning, significantly enhances stroke prediction accuracy by integrating data from multiple sources and using contrastive learning for transfer learning. The self-supervised learning techniques employed allow the model to learn effectively from unlabeled data, reducing the dependency on large annotated datasets. Our framework showed an AUROC improvement of 3.78% from supervised to self-supervised approaches. Additionally, the graph-level representation approach achieved superior performance to image encoders while significantly reducing pre-training and fine-tuning runtimes. These findings indicate that retinal images are a cost-effective method for improving cardiovascular disease predictions and pave the way for future research into retinal and cerebral vessel connections and the use of graph-based retinal vessel representations.
- Europe > Switzerland > Zürich > Zürich (0.15)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- Europe > United Kingdom (0.04)
- Asia > Singapore (0.04)
Transformer-Based Classification Outcome Prediction for Multimodal Stroke Treatment
Ma, Danqing, Wang, Meng, Xiang, Ao, Qi, Zongqing, Yang, Qin
This study proposes Multitrans, a multi-modal fusion framework based on the Transformer architecture and self-attention mechanism. The framework combines non-contrast computed tomography (NCCT) images with the discharge diagnosis reports of patients undergoing stroke treatment, and uses several Transformer-based methods to predict the functional outcomes of that treatment. The results show that single-modal text classification performs significantly better than single-modal image classification, and that the multi-modal combination performs better than either single modality. Although the Transformer model performs worse on imaging data alone, combining imaging with clinical meta-diagnostic information lets the model learn complementary information that contributes to accurately predicting stroke treatment outcomes.
- North America > United States > New Jersey > Hudson County > Hoboken (0.05)
- North America > United States > New York (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Health & Medicine > Therapeutic Area > Neurology (1.00)
- Health & Medicine > Therapeutic Area > Hematology (1.00)
- Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.94)
Stroke prediction with Machine Learning ...
Stroke is among the most common and dangerous misdiagnosed medical conditions, and timely detection is key to effective management. Patients who are treated within an hour of the onset of symptoms have a greater chance of surviving and avoiding long-term brain damage. Data indicates that Blacks, Hispanics, women, older adults on Medicare and residents of rural areas are less likely to be diagnosed during this crucial window. Currently used pre-hospital stroke scales miss approximately 30 percent of cases.
From Conception to Deployment: Intelligent Stroke Prediction Framework using Machine Learning and Performance Evaluation
Ismail, Leila, Materwala, Huned
Stroke is the second leading cause of death worldwide. Machine learning classification algorithms have been widely adopted for stroke prediction. However, these algorithms were evaluated using different datasets and evaluation metrics. Moreover, there is no comprehensive framework for stroke data analytics. This paper proposes an intelligent stroke prediction framework based on a critical examination of machine learning prediction algorithms in the literature. The five most used machine learning algorithms for stroke prediction are evaluated using a unified setup for objective comparison. Comparative analysis and numerical results reveal that the Random Forest algorithm is best suited for stroke prediction.
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
- North America > United States > California > Orange County > Irvine (0.04)
- Asia > Thailand (0.04)
- Health & Medicine > Therapeutic Area > Neurology (1.00)
- Health & Medicine > Health Care Providers & Services (1.00)
- Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.94)
- Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (0.47)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.68)
Machine Learning Performance Analysis to Predict Stroke Based on Imbalanced Medical Dataset
Cerebral stroke, the second leading cause of death worldwide, has been a primary public health concern over the last few years. Machine learning techniques make early detection of stroke warning signs possible, which can help prevent a stroke or reduce its severity. Medical datasets, however, are frequently imbalanced in their class labels, so models tend to predict the minority class poorly. In this paper, the potential risk factors for stroke are investigated. Moreover, four distinct approaches are applied to improve classification of the minority class in the imbalanced stroke dataset, and their performance is compared: an ensemble weighted voting classifier, the Synthetic Minority Over-sampling Technique (SMOTE), Principal Component Analysis with K-Means Clustering (PCA-Kmeans), and Focal Loss with a Deep Neural Network (DNN). The results show that SMOTE and PCA-Kmeans with DNN-Focal Loss work best on a large but severely imbalanced dataset (e.g., the Stroke dataset), outperforming Kaggle's work by a factor of 2-4.
- Europe > United Kingdom > England > Greater London > London (0.04)
- Europe > Switzerland (0.04)
- Asia > China (0.04)
- Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.96)
- Health & Medicine > Therapeutic Area > Neurology (0.68)
- Health & Medicine > Consumer Health (0.66)
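The Focal Loss named in the abstract above down-weights examples the model already classifies confidently, so training concentrates on hard and typically minority-class cases. A minimal per-example sketch follows, assuming the commonly used binary form FL = -alpha_t * (1 - p_t)^gamma * log(p_t). The default `gamma=2.0`, the optional `alpha` weighting, and the function name are assumptions for illustration; the abstract does not give the paper's settings.

```python
import math


def binary_focal_loss(y_true, p, gamma=2.0, alpha=None, eps=1e-12):
    """Focal loss for one example.

    y_true: 0 or 1; p: predicted probability of class 1.
    gamma: focusing strength (gamma=0 reduces to cross-entropy).
    alpha: optional weight for the positive class; negatives get 1 - alpha.
    """
    p = min(max(p, eps), 1.0 - eps)  # keep log() finite
    p_t = p if y_true == 1 else 1.0 - p  # probability given to the true class
    if alpha is None:
        weight = 1.0
    else:
        weight = alpha if y_true == 1 else 1.0 - alpha
    return -weight * (1.0 - p_t) ** gamma * math.log(p_t)


# A confidently correct positive contributes far less than a missed one.
easy = binary_focal_loss(1, 0.9)
hard = binary_focal_loss(1, 0.1)
print(easy, hard)
```

Setting `gamma=0` and `alpha=None` recovers ordinary binary cross-entropy, which makes the extra focusing term easy to compare against a baseline.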
Top 5 techniques for Explainable AI
As you can see, these explainable AI techniques are not a "nice-to-have" but mandatory. Using them will help you communicate better with the people affected by AI decisions. In some cases, as seen in the stroke prediction example, understanding these techniques can help improve or save lives. You can try some of the techniques in this article on my website: https://experiencedatascience.com