Optimal Decision-Theoretic Classification Using Non-Decomposable Performance Metrics
Natarajan, Nagarajan, Koyejo, Oluwasanmi, Ravikumar, Pradeep, Dhillon, Inderjit S.
In contrast, decomposable metrics such as accuracy evaluated on set of examples can be decomposed into a sum of per-example accuracies. Non-decomposability of a performance metric is often desirable as it enables a nonlinear tradeoff between the overall confusion matrix entries: true positives (TP), false positives (FP), true negatives (TN) and false negatives (FN). As a result, non-decomposable performance metrics remain popular for imbalanced and rare event classification in medical diagnosis, fraud detection, information retrieval applications [Lewis and Gale, 1994, Drummond and Holte, 2005, Gu et al., 2009, He and Garcia, 2009], and in other problems where the practitioner is interested in measuring tradeoffs beyond standard classification accuracy. A recent flurry of theoretical results and practical algorithms highlights a growing interest in understanding and optimizing non-decomposable metrics [Dembczynski et al., 2011, Ye et al., 2012, Koyejo et al., 2014, Narasimhan et al., 2014]. Existing theoretical analysis has focused on two distinct approaches for characterizing the population version of the non-decomposable metrics: identified by Ye et al. [2012] as decision theoretic analysis (DTA) and empirical utility maximization (EUM). DTA population utilities measure the expected gain of a classifier on a fixed-size test set, while EUM population utilities are a function 1 of the population confusion matrix. In other words, DTA population utilities measure the the average utility over an infinite set of test sets, each of a fixed size, while EUM population utilities evaluate the performance of a classifier over a single infinitely large test set. It has recently been shown that for EUM based population utilities, the optimal classifier for large classes of non-decomposable binary classification metrics is just the sign of the thresholded conditional probability of the positive class with a metric-dependent threshold [Koyejo et al., 2014, Narasimhan et al., 2014]. In addition, practical algorithms have been proposed for such EUM consistent classification based on direct optimization for the threshold on a held-out validation set.
May-7-2015
- Country:
- North America > United States (0.28)
- Genre:
- Research Report > New Finding (0.46)
- Industry:
- Health & Medicine (0.34)
- Technology: