AITopics | survival curve

2604.03502

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Syria > Aleppo Governorate > Aleppo (0.04)

Genre:

Research Report > Strength High (1.00)
Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Software (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)

arXiv.org Machine Learning, Feb-17-2026

Efficient and Debiased Learning of Average Hazard Under Non-Proportional Hazards

Meng, Xiang, Tian, Lu, Kehl, Kenneth, Uno, Hajime

The hazard ratio from the Cox proportional hazards model is a ubiquitous summary of treatment effect. However, when hazards are non-proportional, the hazard ratio can lose a stable causal interpretation and become study-dependent because it effectively averages time-varying effects with weights determined by follow-up and censoring. We consider the average hazard (AH) as an alternative causal estimand: a population-level person-time event rate that remains well-defined and interpretable without assuming proportional hazards. Although AH can be estimated nonparametrically and regression-style adjustments have been proposed, existing approaches do not provide a general framework for flexible, high-dimensional nuisance estimation with valid $\sqrt{n}$ inference. We address this gap by developing a semiparametric, doubly robust framework for covariate-adjusted AH. We establish pathwise differentiability of AH in the nonparametric model, derive its efficient influence function, and construct cross-fitted, debiased estimators that leverage machine learning for nuisance estimation while retaining asymptotically normal, $\sqrt{n}$-consistent inference under mild product-rate conditions. Simulations demonstrate that the proposed estimator achieves small bias and near-nominal confidence-interval coverage across proportional and non-proportional hazards settings, including crossing-hazards regimes where Cox-based summaries can be unstable. We illustrate practical utility in comparative effectiveness research by comparing immunotherapy regimens for advanced melanoma using SEER-Medicare linked data.
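As a rough illustration of the estimand only, the sketch below computes an unadjusted plug-in average hazard from a Kaplan-Meier curve. It assumes the usual definition AH(tau) = (1 - S(tau)) / integral from 0 to tau of S(t) dt, with a truncation time `tau` chosen by the analyst. The function names are hypothetical, and this is not the covariate-adjusted, cross-fitted estimator the abstract proposes.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time for right-censored data.

    events[i] is 1 if the event was observed at times[i], 0 if censored.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects that share the same observed time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / n_at_risk
            curve.append((t, surv))
        n_at_risk -= removed
    return curve


def average_hazard(times, events, tau):
    """Plug-in AH(tau) = (1 - S(tau)) / integral_0^tau S(t) dt (assumed definition)."""
    if tau <= 0:
        raise ValueError("tau must be positive")
    area = 0.0  # restricted mean survival time up to tau
    prev_t, prev_s = 0.0, 1.0
    for t, s in kaplan_meier(times, events):
        if t > tau:
            break
        area += prev_s * (t - prev_t)  # the KM curve is a step function
        prev_t, prev_s = t, s
    area += prev_s * (tau - prev_t)
    return (1.0 - prev_s) / area


if __name__ == "__main__":
    print(average_hazard([1, 2, 3, 4], [1, 1, 1, 1], tau=4.0))
```

With four uncensored event times 1, 2, 3, 4 and `tau = 4`, the result is approximately 0.4, which matches 4 events over 10 units of person-time and shows why AH reads as a person-time event rate.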

artificial intelligence, hazard, machine learning, (17 more...)

2602.13475

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Health Care Providers & Services > Reimbursement (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

Villanueva, Nora M., Sestelo, Marta, Meira-Machado, Luis

Efficient and scalable clustering of survival curves

arXiv.org Machine Learning, Dec-19-2025

Survival analysis encompasses a broad range of methods for analyzing time-to-event data, with one key objective being the comparison of survival curves across groups. Traditional approaches for identifying clusters of survival curves often rely on computationally intensive bootstrap techniques to approximate the null hypothesis distribution. While effective, these methods impose significant computational burdens. In this work, we propose a novel approach that leverages k-means clustering and the log-rank test to efficiently identify and cluster survival curves. Our method eliminates the need for computationally expensive resampling, significantly reducing processing time while maintaining statistical reliability. By systematically evaluating survival curves and determining optimal clusters, the proposed method ensures a practical and scalable alternative for large-scale survival data analysis. Through simulation studies, we demonstrate that our approach achieves results comparable to existing bootstrap-based clustering methods while dramatically improving computational efficiency. These findings suggest that the log-rank-based clustering procedure offers a viable and time-efficient solution for researchers working with multiple survival curves in medical and epidemiological studies.
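The abstract does not spell out how k-means and the log-rank test are combined, so the sketch below covers only the standard two-sample log-rank statistic, which is the kind of check one might use to decide whether two curves belong in the same cluster. The function name is hypothetical and the code is an illustration, not the authors' procedure.

```python
def logrank_statistic(times1, events1, times2, events2):
    """Two-sample log-rank chi-square statistic for right-censored data.

    events are 1 for an observed event and 0 for a censored observation.
    """
    event_times = sorted(
        {t for t, e in zip(times1, events1) if e}
        | {t for t, e in zip(times2, events2) if e}
    )
    obs_minus_exp = 0.0  # observed minus expected events in group 1
    variance = 0.0
    for t in event_times:
        n1 = sum(1 for x in times1 if x >= t)  # at risk in group 1
        n2 = sum(1 for x in times2 if x >= t)  # at risk in group 2
        d1 = sum(1 for x, e in zip(times1, events1) if x == t and e)
        d2 = sum(1 for x, e in zip(times2, events2) if x == t and e)
        n, d = n1 + n2, d1 + d2
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:
            # Hypergeometric variance of d1 given the margins at time t.
            variance += d * (n1 / n) * (n2 / n) * (n - d) / (n - 1)
    return obs_minus_exp ** 2 / variance if variance > 0 else 0.0


if __name__ == "__main__":
    print(logrank_statistic([1, 2], [1, 1], [3, 4], [1, 1]))
```

Under the null hypothesis of equal survival curves the statistic is approximately chi-square with one degree of freedom, so values above about 3.84 would reject equality at the 5% level.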

correction, procedure, survival curve, (14 more...)

2512.16481

Country:

Europe > Netherlands > South Holland > Rotterdam (0.05)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
Europe > Spain > Galicia > A Coruña Province > Santiago de Compostela (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.67)

Cardoso, Lucas Buk, Angelo, Simone Aldrey, Bonilha, Yasmin Pacheco Gil, Maia, Fernando, Ribeiro, Adeylson Guimarães, Curado, Maria Paula, Fernandes, Gisele Aparecida, Parro, Vanderlei Cunha, Cipparrone, Flávio Almeida de Magalhães, Filho, Alexandre Dias Porto Chiavegatto, Filho, Victor Wünsch, Toporcov, Tatiana Natasha

Methodology for Comparing Machine Learning Algorithms for Survival Analysis

arXiv.org Artificial Intelligence, Dec-2-2025

This study presents a comparative methodological analysis of six machine learning models for survival analysis (MLSA). Using data from nearly 45,000 colorectal cancer patients in the Hospital-Based Cancer Registries of São Paulo, we evaluated Random Survival Forest (RSF), Gradient Boosting for Survival Analysis (GBSA), Survival SVM (SSVM), XGBoost-Cox (XGB-Cox), XGBoost-AFT (XGB-AFT), and LightGBM (LGBM), capable of predicting survival considering censored data. Hyperparameter optimization was performed with different samplers, and model performance was assessed using the Concordance Index (C-Index), C-Index IPCW, time-dependent AUC, and Integrated Brier Score (IBS). Survival curves produced by the models were compared with predictions from classification algorithms, and predictor interpretation was conducted using SHAP and permutation importance. XGB-AFT achieved the best performance (C-Index = 0.7618; IPCW = 0.7532), followed by GBSA and RSF. The results highlight the potential and applicability of MLSA to improve survival prediction and support decision making.
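For readers unfamiliar with the headline metric, here is a minimal sketch of Harrell's concordance index for right-censored data. It assumes that a higher risk score means earlier expected failure, and it counts ties in the score as half-concordant. The function name is hypothetical, and the IPCW variant reported in the abstract is not implemented here.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index: share of comparable pairs ranked correctly by risk."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            # Comparable: subject i had an observed event strictly before time j.
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


if __name__ == "__main__":
    print(concordance_index([1, 2, 3], [1, 1, 1], [2.0, 3.0, 1.0]))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which puts the reported C-Index of 0.7618 in context.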

artificial intelligence, machine learning, survival analysis, (16 more...)

2510.24473

Country: South America > Brazil > São Paulo (0.25)

Genre: Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Therapeutic Area > Oncology > Colorectal Cancer (0.36)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Neural Information Processing Systems, Nov-21-2025, 11:34:05 GMT

Deep Multi-task Gaussian Processes for Survival Analysis with Competing Risks

Designing optimal treatment plans for patients with comorbidities requires accurate cause-specific mortality prognosis. Motivated by the recent availability of linked electronic health records, we develop a nonparametric Bayesian model for survival analysis with competing risks, which can be used for jointly assessing a patient's risk of multiple (competing) adverse outcomes. The model views a patient's survival times with respect to the competing risks as the outputs of a deep multi-task Gaussian process (DMGP), the inputs to which are the patients' covariates. Unlike parametric survival analysis methods based on Cox and Weibull models, our model uses DMGPs to capture complex non-linear interactions between the patients' covariates and cause-specific survival times, thereby learning flexible patient-specific and cause-specific survival curves, all in a data-driven fashion without explicit parametric assumptions on the hazard rates. We propose a variational inference algorithm that is capable of learning the model parameters from time-to-event data while handling right censoring. Experiments on synthetic and real data show that our model outperforms the state-of-the-art survival models.

artificial intelligence, machine learning, survival time, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > California > Los Angeles County > Long Beach (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report > Experimental Study (0.93)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Therapeutic Area > Nephrology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)

Tejedor, Guillaume, Peralta, Veronika, Labroche, Nicolas, Marcel, Patrick, Blasco, Hélène, Alarcan, Hugo

Learning a Distance for the Clustering of Patients with Amyotrophic Lateral Sclerosis

arXiv.org Artificial Intelligence, Nov-5-2025

Amyotrophic lateral sclerosis (ALS) is a severe disease with a typical survival of 3-5 years after symptom onset. Current treatments offer only limited life extension, and the variability in patient responses highlights the need for personalized care. However, research is hindered by small, heterogeneous cohorts, sparse longitudinal data, and the lack of a clear definition for clinically meaningful patient clusters. Existing clustering methods remain limited in both scope and number. To address this, we propose a clustering approach that groups sequences using a disease progression declarative score. Our approach integrates medical expertise through multiple descriptive variables, investigating several distance measures combining such variables, both by reusing off-the-shelf distances and employing a weak-supervised learning method. We pair these distances with clustering methods and benchmark them against state-of-the-art techniques. The evaluation of our approach on a dataset of 353 ALS patients from the University Hospital of Tours shows that our method outperforms state-of-the-art methods in survival analysis while achieving comparable silhouette scores. In addition, the learned distances enhance the relevance and interpretability of results for medical experts.

artificial intelligence, machine learning, sequence, (18 more...)

2511.01945

Country: Europe (0.28)

Genre:

Research Report > Promising Solution (0.69)
Research Report > Experimental Study (0.69)
Research Report > New Finding (0.68)

Industry: Health & Medicine > Therapeutic Area > Neurology > Amyotrophic Lateral Sclerosis (ALS) (0.92)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.46)

Schlender, Thalea, Romme, Catharina J. A., van der Linden, Yvette M., van Lonkhuijzen, Luc R. C. W., Bosman, Peter A. N., Alderliesten, Tanja

PISA: An AI Pipeline for Interpretable-by-design Survival Analysis Providing Multiple Complexity-Accuracy Trade-off Models

arXiv.org Artificial Intelligence, Sep-30-2025

Survival analysis is central to clinical research, informing patient prognoses, guiding treatment decisions, and optimising resource allocation. Accurate time-to-event predictions not only improve quality of life but also reveal risk factors that shape clinical practice. For these models to be relevant in healthcare, interpretability is critical: predictions must be traceable to patient-specific characteristics, and risk factors should be identifiable to generate actionable insights for both clinicians and researchers. Traditional survival models often fail to capture non-linear interactions, while modern deep learning approaches, though powerful, are limited by poor interpretability. We propose a Pipeline for Interpretable Survival Analysis (PISA) - a pipeline that provides multiple survival analysis models that trade off complexity and performance. Using multiple-feature, multi-objective feature engineering, PISA transforms patient characteristics and time-to-event data into multiple survival analysis models, providing valuable insights into the survival prediction task. Crucially, every model is converted into simple patient stratification flowcharts supported by Kaplan-Meier curves, whilst not compromising on performance. While PISA is model-agnostic, we illustrate its flexibility through applications of Cox regression and shallow survival trees, the latter avoiding proportional hazards assumptions. Applied to two clinical benchmark datasets, PISA produced interpretable survival models and intuitive stratification flowcharts whilst achieving state-of-the-art performances. Revisiting a prior departmental study further demonstrated its capacity to automate survival analysis workflows in real-world clinical research.

artificial intelligence, machine learning, survival analysis model, (16 more...)

2509.22673

Country: Europe > Netherlands (0.46)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Therapeutic Area > Oncology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.86)

arXiv.org Machine Learning, Sep-24-2025

KM-GPT: An Automated Pipeline for Reconstructing Individual Patient Data from Kaplan-Meier Plots

Zhao, Yao, Sun, Haoyue, Ding, Yantian, Xu, Yanxun

Reconstructing individual patient data (IPD) from Kaplan-Meier (KM) plots provides valuable insights for evidence synthesis in clinical research. However, existing approaches often rely on manual digitization, which is error-prone and lacks scalability. To address these limitations, we develop KM-GPT, the first fully automated, AI-powered pipeline for reconstructing IPD directly from KM plots with high accuracy, robustness, and reproducibility. KM-GPT integrates advanced image preprocessing, multi-modal reasoning powered by GPT-5, and iterative reconstruction algorithms to generate high-quality IPD without manual input or intervention. Its hybrid reasoning architecture automates the conversion of unstructured information into structured data flows and validates data extraction from complex KM plots. To improve accessibility, KM-GPT is equipped with a user-friendly web interface and an integrated AI assistant, enabling researchers to reconstruct IPD without requiring programming expertise. KM-GPT was rigorously evaluated on synthetic and real-world datasets, consistently demonstrating superior accuracy. To illustrate its utility, we applied KM-GPT to a meta-analysis of gastric cancer immunotherapy trials, reconstructing IPD to facilitate evidence synthesis and biomarker-based subgroup analyses. By automating traditionally manual processes and providing a scalable, web-based solution, KM-GPT transforms clinical research by leveraging reconstructed IPD to enable more informed downstream analyses, supporting evidence-based decision-making.

km plot, km-gpt, survival curve, (16 more...)

2509.18141

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)

Veeraragavan, Narasimha Raghavan, Nygård, Jan Franz

Federated Survival Analysis with Node-Level Differential Privacy: Private Kaplan-Meier Curves

arXiv.org Artificial Intelligence, Sep-3-2025

We investigate how to calculate Kaplan-Meier survival curves across multiple health-care jurisdictions while protecting patient privacy with node-level differential privacy. Each site discloses its curve only once, adding Laplace noise whose scale is determined by the length of the common time grid; the server then averages the noisy curves, so the overall privacy budget remains unchanged. We benchmark four one-shot smoothing techniques (Discrete Cosine Transform, Haar Wavelet shrinkage, adaptive Total-Variation denoising, and a parametric Weibull fit) on the NCCTG lung-cancer cohort under five privacy levels and three partition scenarios (uniform, moderately skewed, highly imbalanced). Total-Variation gives the best mean accuracy, whereas the frequency-domain smoothers offer stronger worst-case robustness and the Weibull model shows the most stable behaviour at the strictest privacy setting. Across all methods the released curves keep the empirical log-rank type-I error below fifteen percent for privacy budgets of 0.5 and higher, demonstrating that clinically useful survival information can be shared without iterative training or heavy cryptography.
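The one-shot release and averaging step might look roughly like the sketch below. Several details are assumptions not stated in the abstract: the noise scale is taken to be grid length times a per-point `sensitivity` divided by `epsilon`, the server uses an unweighted mean, and the clean-up (clipping to [0, 1] and forcing a non-increasing curve) is a simple stand-in for the four smoothers the paper benchmarks. All function and parameter names are hypothetical, and the code makes no claim about the formal privacy guarantee.

```python
import random


def privatize_curve(curve, epsilon, sensitivity, rng):
    """Add Laplace noise to a site's survival curve on the common time grid.

    Assumption: scale = len(curve) * sensitivity / epsilon, i.e. the budget
    is spread across all grid points. The paper's calibration may differ.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    scale = len(curve) * sensitivity / epsilon
    # The difference of two i.i.d. exponentials is a Laplace(0, scale) draw.
    return [
        s + rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
        for s in curve
    ]


def aggregate(noisy_curves):
    """Average the sites' noisy curves, then clip and enforce monotonicity."""
    n_sites = len(noisy_curves)
    averaged = [sum(col) / n_sites for col in zip(*noisy_curves)]
    cleaned = []
    running_min = 1.0
    for s in averaged:
        clipped = max(0.0, min(1.0, s))
        running_min = min(running_min, clipped)  # survival cannot increase
        cleaned.append(running_min)
    return cleaned


if __name__ == "__main__":
    rng = random.Random(0)
    sites = [[1.0, 0.9, 0.7, 0.5], [1.0, 0.8, 0.6, 0.4]]
    noisy = [privatize_curve(c, epsilon=1.0, sensitivity=0.01, rng=rng) for c in sites]
    print(aggregate(noisy))
```

Because each site adds noise before sharing and the server only post-processes what it receives, no raw patient-level data leaves a site in this sketch.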

artificial intelligence, machine learning, privacy, (17 more...)

2509.00615

Country:

Europe > Norway > Eastern Norway > Oslo (0.04)
North America > United States (0.04)
Europe > Norway > Northern Norway > Troms > Tromsø (0.04)

Genre: Research Report (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area > Oncology (0.89)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.83)

arXiv.org Artificial Intelligence, Jul-17-2025

Targeted Deep Architectures: A TMLE-Based Framework for Robust Causal Inference in Neural Networks

Li, Yi, Mccoy, David, Gunter, Nolan, Lee, Kaitlyn, Schuler, Alejandro, van der Laan, Mark

Modern deep neural networks are powerful predictive tools yet often lack valid inference for causal parameters, such as treatment effects or entire survival curves. While frameworks like Double Machine Learning (DML) and Targeted Maximum Likelihood Estimation (TMLE) can debias machine-learning fits, existing neural implementations either rely on "targeted losses" that do not guarantee solving the efficient influence function equation or computationally expensive post-hoc "fluctuations" for multi-parameter settings. We propose Targeted Deep Architectures (TDA), a new framework that embeds TMLE directly into the network's parameter space with no restrictions on the backbone architecture. Specifically, TDA partitions model parameters - freezing all but a small "targeting" subset - and iteratively updates them along a targeting gradient, derived from projecting the influence functions onto the span of the gradients of the loss with respect to weights. This procedure yields plug-in estimates that remove first-order bias and produce asymptotically valid confidence intervals. Crucially, TDA easily extends to multi-dimensional causal estimands (e.g., entire survival curves) by merging separate targeting gradients into a single universal targeting update. Theoretically, TDA inherits classical TMLE properties, including double robustness and semiparametric efficiency. Empirically, on the benchmark IHDP dataset (average treatment effects) and simulated survival data with informative censoring, TDA reduces bias and improves coverage relative to both standard neural-network estimators and prior post-hoc approaches. In doing so, TDA establishes a direct, scalable pathway toward rigorous causal inference within modern deep architectures for complex multi-parameter targets.

artificial intelligence, machine learning, targ, (19 more...)

2507.12435

Genre: Research Report > Experimental Study (0.68)

Industry:

Health & Medicine (0.46)
Law > Civil Rights & Constitutional Law (0.36)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.54)