autoscore-survival
SHAPoint: Task-Agnostic, Efficient, and Interpretable Point-Based Risk Scoring via Shapley Values
Meirman, Tomer D., Shapira, Bracha, Dagan, Noa, Rokach, Lior S.
Interpretable risk scores play a vital role in clinical decision support, yet traditional methods for deriving such scores often rely on manual preprocessing, task-specific modeling, and simplified assumptions that limit their flexibility and predictive power. We present SHAPoint, a novel, task-agnostic framework that integrates the predictive accuracy of gradient boosted trees with the interpretability of point-based risk scores. SHAPoint supports classification, regression, and survival tasks, while also inheriting valuable properties from tree-based models, such as native handling of missing data and support for monotonic constraints. Compared to existing frameworks, SHAPoint offers superior flexibility, reduced reliance on manual preprocessing, and faster runtime performance. Empirical results show that SHAPoint produces compact and interpretable scores with predictive performance comparable to state-of-the-art methods, but at a fraction of the runtime, making it a powerful tool for transparent and scalable risk stratification.
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Health & Medicine > Therapeutic Area > Oncology (1.00)
- Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Developing Federated Time-to-Event Scores Using Heterogeneous Real-World Survival Data
Li, Siqi, Shang, Yuqing, Wang, Ziwen, Wu, Qiming, Hong, Chuan, Ning, Yilin, Miao, Di, Ong, Marcus Eng Hock, Chakraborty, Bibhas, Liu, Nan
Survival analysis serves as a fundamental component in numerous healthcare applications, where the determination of the time to specific events (such as the onset of a certain disease or death) for patients is crucial for clinical decision-making. Scoring systems are widely used for swift and efficient risk prediction. However, existing methods for constructing survival scores presume that data originates from a single source, posing privacy challenges in collaborations with multiple data owners. We propose a novel framework for building federated scoring systems for multi-site survival outcomes, ensuring both privacy and communication efficiency. We applied our approach to sites with heterogeneous survival data originating from emergency departments in Singapore and the United States. Additionally, we independently developed local scores at each site. In testing datasets from each participant site, our proposed federated scoring system consistently outperformed all local models, evidenced by higher integrated area under the receiver operating characteristic curve (iAUC) values, with a maximum improvement of 11.6%. Additionally, the federated score's time-dependent AUC(t) values showed advantages over local scores, exhibiting narrower confidence intervals (CIs) across most time points. The model developed through our proposed method exhibits effective performance on each local site, signifying noteworthy implications for healthcare research. Sites participating in our proposed federated scoring model training gained benefits by acquiring survival models with enhanced prediction accuracy and efficiency. This study demonstrates the effectiveness of our privacy-preserving federated survival score generation framework and its applicability to real-world heterogeneous survival data.
- Asia > Singapore > Central Region > Singapore (0.05)
- North America > United States > North Carolina > Durham County > Durham (0.04)
- North America > United States > New York (0.04)
- Europe > Italy > Tuscany (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
Survival modeling using deep learning, machine learning and statistical methods: A comparative analysis for predicting mortality after hospital admission
Wang, Ziwen, Lee, Jin Wee, Chakraborty, Tanujit, Ning, Yilin, Liu, Mingxuan, Xie, Feng, Ong, Marcus Eng Hock, Liu, Nan
Survival analysis is essential for studying time-to-event outcomes and providing a dynamic understanding of the probability of an event occurring over time. Various survival analysis techniques, from traditional statistical models to state-of-the-art machine learning algorithms, support healthcare intervention and policy decisions. However, there remains ongoing discussion about their comparative performance. We conducted a comparative study of several survival analysis methods, including Cox proportional hazards (CoxPH), stepwise CoxPH, elastic net penalized Cox model, Random Survival Forests (RSF), Gradient Boosting machine (GBM) learning, AutoScore-Survival, DeepSurv, time-dependent Cox model based on neural network (CoxTime), and DeepHit survival neural network. We applied the concordance index (C-index) for model goodness-of-fit, and integral Brier scores (IBS) for calibration, and considered the model interpretability. As a case study, we performed a retrospective analysis of patients admitted through the emergency department of a tertiary hospital from 2017 to 2019, predicting 90-day all-cause mortality based on patient demographics, clinicopathological features, and historical data. The results of the C-index indicate that deep learning achieved comparable performance, with DeepSurv producing the best discrimination (DeepSurv: 0.893; CoxTime: 0.892; DeepHit: 0.891). The calibration of DeepSurv (IBS: 0.041) performed the best, followed by RSF (IBS: 0.042) and GBM (IBS: 0.0421), all using the full variables. Moreover, AutoScore-Survival, using a minimal variable subset, is easy to interpret, and can achieve good discrimination and calibration (C-index: 0.867; IBS: 0.044). While all models were satisfactory, DeepSurv exhibited the best discrimination and calibration. In addition, AutoScore-Survival offers a more parsimonious model and excellent interpretability.
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
- Asia > Singapore > Central Region > Singapore (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Asia > Southeast Asia (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Research Report > Strength Medium (0.88)
- Health & Medicine > Therapeutic Area > Oncology (1.00)
- Health & Medicine > Therapeutic Area > Immunology (1.00)
- Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
- Health & Medicine > Health Care Providers & Services (1.00)
AutoScore-Survival: Developing interpretable machine learning-based time-to-event scores with right-censored survival data
Xie, Feng, Ning, Yilin, Yuan, Han, Goldstein, Benjamin Alan, Ong, Marcus Eng Hock, Liu, Nan, Chakraborty, Bibhas
Scoring systems are highly interpretable and widely used to evaluate time-to-event outcomes in healthcare research. However, existing time-to-event scores are predominantly created ad-hoc using a few manually selected variables based on clinician's knowledge, suggesting an unmet need for a robust and efficient generic score-generating method. AutoScore was previously developed as an interpretable machine learning score generator, integrated both machine learning and point-based scores in the strong discriminability and accessibility. We have further extended it to time-to-event data and developed AutoScore-Survival, for automatically generating time-to-event scores with right-censored survival data. Random survival forest provides an efficient solution for selecting variables, and Cox regression was used for score weighting. We illustrated our method in a real-life study of 90-day mortality of patients in intensive care units and compared its performance with survival models (i.e., Cox) and the random survival forest. The AutoScore-Survival-derived scoring model was more parsimonious than survival models built using traditional variable selection methods (e.g., penalized likelihood approach and stepwise variable selection), and its performance was comparable to survival models using the same set of variables. Although AutoScore-Survival achieved a comparable integrated area under the curve of 0.782 (95% CI: 0.767-0.794), the integer-valued time-to-event scores generated are favorable in clinical applications because they are easier to compute and interpret. Our proposed AutoScore-Survival provides an automated, robust and easy-to-use machine learning-based clinical score generator to studies of time-to-event outcomes. It provides a systematic guideline to facilitate the future development of time-to-event scores for clinical applications.
- Asia > Singapore > Central Region > Singapore (0.05)
- Asia > Middle East > Iran (0.04)
- North America > United States > North Carolina > Durham County > Durham (0.04)
- (2 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Health & Medicine > Therapeutic Area > Oncology (1.00)
- Health & Medicine > Health Care Providers & Services (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.48)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)