Comparing AUCs of Machine Learning Models with DeLong's Test

Feb-19-2020, 02:26:00 GMT–#artificialintelligence

Have you ever wondered how to demonstrate that one machine learning model's test set performance differs significantly from that of an alternative model? This post describes how to use DeLong's test to obtain a p-value for whether two models have significantly different AUCs, where AUC refers to the area under the receiver operating characteristic (ROC) curve. It includes a hand-calculated example that illustrates every step of DeLong's test on a small data set, and an example R implementation that enables efficient calculation on large data sets. An example use case: Model A predicts heart disease risk with an AUC of 0.92, Model B predicts heart disease risk with an AUC of 0.87, and we use DeLong's test to demonstrate that Model A's AUC differs significantly from Model B's with p < 0.05.
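To make the procedure concrete, here is a minimal Python sketch of DeLong's test for two models scored on the same test set. It is not the post's R implementation. The function names `delong_test` and `_placements` are hypothetical, and the sketch assumes binary labels coded as 0 and 1, at least two examples in each class, and a two-sided p-value from the normal approximation.

```python
import math
import numpy as np


def _placements(pos_scores, neg_scores):
    """Return the empirical AUC and the per-example structural components."""
    # psi[i, j] = 1 if positive i outscores negative j, 0.5 on a tie, else 0.
    diff = pos_scores[:, None] - neg_scores[None, :]
    psi = (diff > 0).astype(float) + 0.5 * (diff == 0)
    auc = psi.mean()        # empirical AUC (Mann-Whitney statistic)
    v10 = psi.mean(axis=1)  # one component per positive example
    v01 = psi.mean(axis=0)  # one component per negative example
    return auc, v10, v01


def delong_test(y_true, scores_a, scores_b):
    """Compare two correlated AUCs; returns (auc_a, auc_b, z, p_value)."""
    y_true = np.asarray(y_true)
    scores_a = np.asarray(scores_a, dtype=float)
    scores_b = np.asarray(scores_b, dtype=float)

    pos, neg = y_true == 1, y_true == 0
    m, n = int(pos.sum()), int(neg.sum())
    if m < 2 or n < 2:
        raise ValueError("need at least two positives and two negatives")

    auc_a, v10_a, v01_a = _placements(scores_a[pos], scores_a[neg])
    auc_b, v10_b, v01_b = _placements(scores_b[pos], scores_b[neg])

    # 2x2 sample covariance matrices of the components across the two models
    # (np.cov treats each row as a variable and divides by N - 1 by default).
    s10 = np.cov(np.vstack([v10_a, v10_b]))
    s01 = np.cov(np.vstack([v01_a, v01_b]))
    s = s10 / m + s01 / n

    # Variance of (auc_a - auc_b), accounting for the shared test set.
    var_diff = s[0, 0] + s[1, 1] - 2.0 * s[0, 1]
    if var_diff <= 0:
        raise ValueError("zero variance: the two score vectors rank identically")

    z = (auc_a - auc_b) / math.sqrt(var_diff)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return auc_a, auc_b, z, p_value
```

Calling `delong_test(y_true, scores_a, scores_b)` with the test labels and each model's predicted scores returns both AUCs, the z statistic, and the p-value. The pairwise comparison matrix uses memory proportional to the number of positives times the number of negatives, so this form suits small and medium test sets.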

auc, delong, empirical auc, (16 more...)

Genre:
- Research Report > Experimental Study (0.58)

Industry:
- Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.86)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
