to

### Survival Forests under Test: Impact of the Proportional Hazards Assumption on Prognostic and Predictive Forests for ALS Survival

We investigate the effect of the proportional hazards assumption on prognostic and predictive models of the survival time of patients suffering from amyotrophic lateral sclerosis (ALS). We theoretically compare the underlying model formulations of several variants of survival forests and implementations thereof, including random forests for survival, conditional inference forests, Ranger, and survival forests with $L_1$ splitting, with two novel variants, namely distributional and transformation survival forests. Theoretical considerations explain the low power of log-rank-based splitting in detecting patterns in non-proportional hazards situations in survival trees and corresponding forests. This limitation can potentially be overcome by the alternative split procedures suggested herein. We empirically investigate this effect using simulation experiments and a re-analysis of the PRO-ACT database of ALS survival, giving special emphasis to both prognostic and predictive models.
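
To make the notion of log-rank-based splitting concrete, the following is a minimal sketch of the two-sample log-rank statistic that a survival tree could evaluate for one candidate split. The function name `logrank_statistic` and its argument layout are illustrative assumptions and are not taken from any of the implementations named above. Because the statistic accumulates observed-minus-expected event counts over time, differences of opposite sign (as can occur when hazards cross) may cancel, which is one intuition for the low power under non-proportional hazards.

```python
def logrank_statistic(times_a, events_a, times_b, events_b):
    """Two-sample log-rank chi-square statistic (hypothetical helper).

    times_*:  lists of follow-up times for the two child nodes of a split.
    events_*: lists of 1 (event observed) or 0 (right-censored).
    """
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    # Distinct times at which at least one event was observed.
    event_times = sorted({t for t, e in zip(all_times, all_events) if e})

    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        # Subjects still at risk just before time t in each group.
        n_a = sum(1 for u in times_a if u >= t)
        n_b = sum(1 for u in times_b if u >= t)
        # Events occurring exactly at time t in each group.
        d_a = sum(1 for u, e in zip(times_a, events_a) if e and u == t)
        d_b = sum(1 for u, e in zip(times_b, events_b) if e and u == t)
        n, d = n_a + n_b, d_a + d_b
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            # Hypergeometric variance of the event count in group a.
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    if variance == 0.0:
        return 0.0
    return observed_minus_expected ** 2 / variance
```

In a tree, one would compute this statistic for every candidate split and keep the split with the largest value; that selection loop is omitted here.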

### A Simple Discrete-Time Survival Model for Neural Networks

There is currently great interest in applying neural networks to prediction tasks in medicine. It is important for predictive models to be able to use survival data, where each patient has a known follow-up time and event/censoring indicator. This avoids information loss when training the model and enables generation of predicted survival curves. In this paper, we describe a discrete-time survival model that is designed to be used with neural networks. The model is trained with the maximum likelihood method using minibatch stochastic gradient descent (SGD). The use of SGD enables rapid training speed. The model is flexible, so that the baseline hazard rate and the effect of the input data can vary with follow-up time. It has been implemented in the Keras deep learning framework, and source code for the model and several examples is available online. We demonstrated the high performance of the model by using it as part of a convolutional neural network to predict survival for over 10,000 patients with metastatic cancer, using the full text of 1,137,317 provider notes. The model's C-index on the validation set was 0.71, which was superior to a linear baseline model (C-index 0.69).
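
As a rough illustration of how a discrete-time survival likelihood works, the sketch below computes the negative log-likelihood of one patient from per-interval conditional event probabilities (hazards), plus the implied survival curve. This is a generic textbook formulation, not the paper's Keras code: the function names and the convention that censored patients contribute only the intervals they fully survived are assumptions made for this example.

```python
import math


def discrete_time_nll(hazards, n_survived, event):
    """Negative log-likelihood for one subject (hypothetical helper).

    hazards:    per-interval conditional event probabilities, each in (0, 1),
                e.g. sigmoid outputs of a network with one unit per interval.
    n_survived: number of intervals the subject is known to have survived.
    event:      True if the event occurred in interval index `n_survived`,
                False if the subject was censored after `n_survived` intervals.
    """
    # Probability of surviving each fully observed interval.
    log_lik = sum(math.log(1.0 - h) for h in hazards[:n_survived])
    if event:
        # Probability of failing in the interval where the event occurred.
        log_lik += math.log(hazards[n_survived])
    return -log_lik


def survival_curve(hazards):
    """Predicted survival probability at the end of each interval."""
    curve, surv = [], 1.0
    for h in hazards:
        surv *= 1.0 - h
        curve.append(surv)
    return curve
```

Because each interval has its own hazard, neither the baseline hazard nor the covariate effects need to be constant over follow-up time; summing this loss over a minibatch gives an objective suitable for SGD.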

### On Ranking in Survival Analysis: Bounds on the Concordance Index

In this paper, we show that classical survival analysis involving censored data can naturally be cast as a ranking problem. The concordance index (CI), which quantifies the quality of rankings, is the standard performance measure for model *assessment* in survival analysis. In contrast, the standard approach to *learning* the popular proportional hazards (PH) model is based on Cox's partial likelihood. We devise two bounds on CI, one of which emerges directly from the properties of PH models, and optimize them *directly*. Our experimental results suggest that both methods perform about equally well, with our new approach giving slightly better results than Cox's method. We also explain why a method designed to maximize Cox's partial likelihood also ends up (approximately) maximizing the CI.
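
For readers unfamiliar with the concordance index, here is a minimal sketch of Harrell's C for right-censored data, written as a plain pairwise count. The function name `concordance_index` and the convention that a higher risk score predicts a shorter survival time are assumptions of this example; the bounds proposed in the paper are not implemented here.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored data (sketch).

    times:       observed follow-up times.
    events:      1 if the event was observed, 0 if right-censored.
    risk_scores: model outputs; higher is assumed to mean higher risk.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            # A pair is comparable only if the earlier time is an observed event.
            continue
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

The pairwise indicator inside the loop is what makes the CI a ranking measure: only the ordering of the risk scores matters, not their magnitudes.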

### DeepSurv: Personalized Treatment Recommender System Using A Cox Proportional Hazards Deep Neural Network

Medical practitioners use survival models to explore and understand the relationships between patients' covariates (e.g. clinical and genetic features) and the effectiveness of various treatment options. Standard survival models like the linear Cox proportional hazards model require extensive feature engineering or prior medical knowledge to model treatment interaction at an individual level. While nonlinear survival methods, such as neural networks and survival forests, can inherently model these high-level interaction terms, they have yet to be shown as effective treatment recommender systems. We introduce DeepSurv, a Cox proportional hazards deep neural network and state-of-the-art survival method for modeling interactions between a patient's covariates and treatment effectiveness in order to provide personalized treatment recommendations. We perform a number of experiments training DeepSurv on simulated and real survival data. We demonstrate that DeepSurv performs as well as or better than other state-of-the-art survival models and validate that DeepSurv successfully models increasingly complex relationships between a patient's covariates and their risk of failure. We then show how DeepSurv models the relationship between a patient's features and the effectiveness of different treatment options, and how this can be used to provide individual treatment recommendations. Finally, we train DeepSurv on real clinical studies to demonstrate how its personalized treatment recommendations would increase the survival time of a set of patients. The predictive and modeling capabilities of DeepSurv will enable medical researchers to use deep neural networks as a tool in their exploration, understanding, and prediction of the effects of a patient's characteristics on their risk of failure.
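
The objective behind a Cox-type network can be sketched as the negative log partial likelihood, with the network's scalar output playing the role of the log-risk. The code below is a minimal, framework-free illustration under stated assumptions: the function name is hypothetical, tied event times are handled naively, and any averaging or regularization that DeepSurv's actual loss may apply is omitted.

```python
import math


def cox_neg_log_partial_likelihood(times, events, log_risks):
    """Negative log Cox partial likelihood (sketch, naive tie handling).

    times:     observed follow-up times.
    events:    1 if the event was observed, 0 if right-censored.
    log_risks: predicted log-risk per subject (e.g. a network's output).
    """
    total = 0.0
    for i, (t_i, e_i) in enumerate(zip(times, events)):
        if not e_i:
            continue  # censored subjects appear only in others' risk sets
        # Risk set: everyone still under observation at time t_i.
        risk_set = [log_risks[j] for j, t_j in enumerate(times) if t_j >= t_i]
        # Numerically stable log-sum-exp over the risk set.
        m = max(risk_set)
        log_sum = m + math.log(sum(math.exp(r - m) for r in risk_set))
        total += log_risks[i] - log_sum
    return -total
```

Assigning higher log-risk to subjects who fail earlier lowers this loss, which is the behavior a gradient-based optimizer would exploit during training.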

### Feature Selection and Case-Based Reasoning for Survival Analysis in Bioinformatics

The development of microarray technology has made it possible to assemble biomedical datasets that measure the expression profile of thousands of genes simultaneously. However, such high-dimensional datasets make computation costly and can complicate the interpretation of a predictive model. To address this, feature selection methods are used to extract biological information from a large amount of data in order to filter the expression dataset down to the smallest possible subset of accurate predictor genes. Feature selection has three main advantages: it decreases computational costs, mitigates the possibility of overfitting due to high inter-variable correlations, and allows for an easier clinical interpretation of the model. In this paper we compare three methods of feature selection, namely iterative Bayesian Model Averaging (BMA), Random Survival Forest (RSF), and Cox Proportional Hazards (CPH), and five methods of survival analysis: Random Survival Forest (RSF), Cox Proportional Hazards (CPH), Aalen Additive Fitter (AAF), DeepSurv (neural network), and CbrSurv (case-based reasoning), which we introduce in this paper. Features selected by these methods are compared with a hand-selected set of features. All the data we used came from the METABRIC breast cancer dataset. Our results indicate that feature selection improves the performance of survival analysis methods. Overall, the best survival analysis performance was obtained by combining RSF for feature selection and CbrSurv, closely followed by DeepSurv, for survival prediction.