Confusion Matrices and Accuracy Statistics for Binary Classifiers Using Unlabeled Data: The Diagnostic Test Approach

Dec-27-2022–arXiv.org Artificial Intelligence

Sometimes it is important to know the accuracy of a classifier on unlabeled data. The labels may be delayed, as in consumer purchasing predictions, or obtaining them may be cost-prohibitive. The labels may not exist at all, as for some medical conditions for which the true gold standard diagnostic test (a 100% sensitive and 100% specific classifier) would require that subjects be euthanized and autopsied to obtain labels. Epidemiologists and biostatisticians have developed statistical methods for assessing the sensitivity (Se) and specificity (Sp) of diagnostic tests when gold standard comparison tests are unavailable. In data science terms, the diagnostic test assessment data are unlabeled. In this article, I describe how to modify those diagnostic test statistical methods to estimate confusion matrices and accuracy statistics for binary classifiers.
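To make the link between Se, Sp, and a confusion matrix concrete, the sketch below shows only the final arithmetic step. It is not the article's estimation procedure: it assumes that estimates of Se, Sp, and the positive-class prevalence have already been obtained by some method that does not need labels, and it converts them into expected confusion matrix cells and summary statistics. The function names `expected_confusion_matrix` and `accuracy_stats` and the numeric inputs are hypothetical, chosen for illustration.

```python
def expected_confusion_matrix(se, sp, prevalence, n):
    """Expected cell counts for n items, given sensitivity (se),
    specificity (sp), and positive-class prevalence.

    Assumption: se, sp, and prevalence are estimates supplied by the
    caller; this function does not estimate them from data.
    """
    for name, value in (("se", se), ("sp", sp), ("prevalence", prevalence)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")

    positives = n * prevalence          # expected number of true positives-class items
    negatives = n * (1.0 - prevalence)  # expected number of true negative-class items

    return {
        "TP": positives * se,           # positives classified as positive
        "FN": positives * (1.0 - se),   # positives classified as negative
        "TN": negatives * sp,           # negatives classified as negative
        "FP": negatives * (1.0 - sp),   # negatives classified as positive
    }


def accuracy_stats(cm):
    """Accuracy, precision, and recall from a confusion matrix dict."""
    total = cm["TP"] + cm["FN"] + cm["TN"] + cm["FP"]
    predicted_pos = cm["TP"] + cm["FP"]
    actual_pos = cm["TP"] + cm["FN"]
    nan = float("nan")
    return {
        "accuracy": (cm["TP"] + cm["TN"]) / total if total else nan,
        "precision": cm["TP"] / predicted_pos if predicted_pos else nan,
        "recall": cm["TP"] / actual_pos if actual_pos else nan,
    }


if __name__ == "__main__":
    # Illustrative inputs only, not results from the article.
    cm = expected_confusion_matrix(se=0.9, sp=0.8, prevalence=0.25, n=1000)
    print(cm)
    print(accuracy_stats(cm))
```

Because the cells are expected counts they are generally not integers, and any uncertainty in the Se, Sp, and prevalence estimates carries through to every derived statistic.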

artificial intelligence, classifier, machine learning

Genre:
- Research Report (0.40)

Industry:
- Health & Medicine > Health Care Technology (1.00)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Performance Analysis > Accuracy (1.00)
  - Learning Graphical Models > Directed Networks
    - Bayesian Learning (0.69)
