The Brier Score under Administrative Censoring: Problems and Solutions

Kvamme, Håvard; Borgan, Ørnulf

arXiv.org Machine Learning 

Box 1053 Blindern 0316 Oslo, Norway

Abstract

The Brier score is commonly used for evaluating probability predictions. In survival analysis, with right-censored observations of the event times, this score can be weighted by the inverse probability of censoring (IPCW) to retain its original interpretation. It is common practice to estimate the censoring distribution with the Kaplan-Meier estimator, even though it assumes that the censoring distribution is independent of the covariates. This paper discusses the general impact of the censoring estimates on the Brier score and shows that the estimation of the censoring distribution can be problematic. In particular, when the censoring times can be identified from the covariates, the IPCW score is no longer valid. For administratively censored data, where the potential censoring times are known for all individuals, we propose an alternative version of the Brier score. This administrative Brier score does not require estimation of the censoring distribution and is valid even if the censoring times can be identified from the covariates.

Keywords: survival analysis, time-to-event-prediction, customer churn, inverse probability weighting, progressive type I censoring

1. Introduction

Recently, there has been an increasing interest in combining machine learning methodology with survival analysis for improved time-to-event prediction. Also worth mentioning is the Random Survival Forest (Ishwaran et al., 2008) which makes decision trees based on the log-rank test and estimates the cumulative hazards with the Nelson-Aalen estimator. Although these methods are available for right-censored event times, a substantial part of the machine learning community is not familiar with survival analysis and might find it reasonable to instead apply binary classifiers for time-to-event prediction.
In short, a binary classifier estimates the probability that an individual experiences the event by time t, and can be fitted by disregarding individuals censored before that time. Arguably, the two most common evaluation criteria for survival predictions are the inverse probability of censoring weighted (IPCW) Brier score (Graf et al., 1999; Gerds and Schumacher, 2006) and different versions of the concordance index (Harrell Jr et al., 1982; Antolini et al., 2005; Uno et al., 2011; Gerds et al., 2013).
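To make the IPCW Brier score mentioned above concrete, the following is a minimal sketch of the standard weighting at a single time t, with the censoring distribution estimated by Kaplan-Meier. It is not the authors' implementation. The function names (`km_censoring`, `step_eval`, `ipcw_brier`) are hypothetical, and evaluating the censoring estimate just before each event time (the left limit) is an assumed convention; some implementations evaluate it at the event time itself.

```python
import numpy as np


def km_censoring(durations, events):
    """Kaplan-Meier estimate of the censoring survival function G(t) = P(C > t).

    Censoring (events == 0) is treated as the "event" of interest.
    Returns the distinct observed times and G evaluated at each of them.
    """
    durations = np.asarray(durations, dtype=float)
    censored = 1 - np.asarray(events, dtype=int)
    times = np.unique(durations)
    at_risk = np.array([(durations >= u).sum() for u in times])
    n_cens = np.array([((durations == u) & (censored == 1)).sum() for u in times])
    surv = np.cumprod(1.0 - n_cens / at_risk)
    return times, surv


def step_eval(times, surv, t, left_limit=False):
    """Evaluate the right-continuous step function at t (or just before t)."""
    side = "left" if left_limit else "right"
    idx = np.searchsorted(times, t, side=side)
    # Before the first observed time the survival estimate equals 1.
    return np.where(idx == 0, 1.0, surv[np.maximum(idx - 1, 0)])


def ipcw_brier(t, durations, events, surv_pred_at_t):
    """IPCW Brier score at time t.

    surv_pred_at_t[i] is the predicted probability that individual i
    survives past t. Individuals censored before t get weight zero.
    """
    durations = np.asarray(durations, dtype=float)
    events = np.asarray(events, dtype=int)
    s = np.asarray(surv_pred_at_t, dtype=float)

    times, g = km_censoring(durations, events)
    g_t = step_eval(times, g, t)                              # G(t)
    g_ti = step_eval(times, g, durations, left_limit=True)    # G(T_i-), assumed convention

    event_before = (durations <= t) & (events == 1)  # event observed by time t
    alive = durations > t                            # still under observation at t

    contrib = np.zeros_like(s)
    contrib[event_before] = s[event_before] ** 2 / g_ti[event_before]
    contrib[alive] = (1.0 - s[alive]) ** 2 / g_t
    return contrib.mean()


# Toy data: the second individual is censored at time 2.
durations = [1, 2, 3, 4]
events = [1, 0, 1, 1]
predicted_survival = [0.2, 0.4, 0.6, 0.8]
print(ipcw_brier(2.5, durations, events, predicted_survival))
```

With no censoring, all weights equal one and the function reduces to the ordinary Brier score. Because the Kaplan-Meier weights ignore the covariates, this sketch inherits the assumption criticized in the abstract: it is only appropriate when censoring is independent of the covariates.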
