Feature Selection and Case-Based Reasoning for Survival Analysis in Bioinformatics

AAAI Conferences

The development of microarray technology has made it possible to assemble biomedical datasets that measure the expression profiles of thousands of genes simultaneously. However, such high-dimensional datasets make computation costly and can complicate the interpretation of a predictive model. To address this, feature selection methods filter the expression dataset down to the smallest possible subset of genes that remain accurate predictors. Feature selection has three main advantages: it decreases computational costs, mitigates the risk of overfitting due to high inter-variable correlations, and allows for an easier clinical interpretation of the model. In this paper we compare three methods of feature selection: iterative Bayesian Model Averaging (BMA), Random Survival Forest (RSF), and Cox Proportional Hazard (CPH); and five methods of survival analysis: Random Survival Forest (RSF), Cox Proportional Hazard (CPH), Aalen Additive Fitter (AAF), DeepSurv (neural network), and CbrSurv (case-based reasoning), which we introduce in this paper. Features selected by these methods are compared with a hand-selected set of features. All the data we used came from the METABRIC breast cancer dataset. Our results indicate that feature selection improves the performance of survival analysis methods. Overall, the best survival analysis performance was obtained by combining RSF for feature selection with CbrSurv for survival prediction, closely followed by DeepSurv.
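
To make the idea of filtering an expression dataset down to a few predictor genes concrete, here is a minimal Python sketch. It is not one of the three selection methods compared in the paper (BMA, RSF, CPH). Instead it swaps in a much simpler univariate filter that ranks each feature by how far its Harrell concordance index lies from 0.5 on right-censored data. The abstract does not state which performance metric the paper uses, so the choice of the concordance index is an assumption, and the function names and toy data are hypothetical.

```python
from itertools import combinations


def concordance_index(times, events, scores):
    """Harrell's C: share of comparable pairs in which the subject with the
    shorter survival time also has the higher risk score (ties count 0.5)."""
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Skip tied times and pairs whose earlier time is censored.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if scores[a] > scores[b]:
            concordant += 1.0
        elif scores[a] == scores[b]:
            concordant += 0.5
    return concordant / comparable if comparable else 0.5


def select_top_k(X, times, events, k):
    """Return the indices of the k features whose concordance with survival
    is farthest from 0.5 (chance level), in either direction."""
    strength = []
    for f in range(len(X[0])):
        column = [row[f] for row in X]
        c = concordance_index(times, events, column)
        strength.append((abs(c - 0.5), f))
    # Strongest first; ties broken by feature index for determinism.
    strength.sort(key=lambda s: (-s[0], s[1]))
    return [f for _, f in strength[:k]]


# Hypothetical toy data: 5 patients, 3 "genes".
times = [2, 4, 6, 8, 10]   # observed follow-up time
events = [1, 1, 0, 1, 1]   # 1 = event observed, 0 = censored
X = [
    [5, 1, 1],
    [4, 3, 2],
    [3, 2, 3],
    [2, 3, 4],
    [1, 1, 5],
]

selected = select_top_k(X, times, events, k=2)
```

On this toy data, features 0 and 2 are monotonically related to survival time while feature 1 is not, so the filter keeps the first and last columns. A univariate filter like this ignores interactions between genes, which is one reason model-based selectors such as those compared in the paper may be preferred in practice.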