Scaling Up ROC-Optimizing Support Vector Machines

Nov-10-2025–arXiv.org Machine Learning

Binary classification is a fundamental problem in machine learning. Given a pair (X, Y), where X is a p-dimensional predictor and Y is a binary response taking values in {−1, 1}, the goal is to learn a decision function f of X that predicts Y by Ŷ = sign{f(X)}. A canonical approach is to choose the f that minimizes the classification error or, equivalently, maximizes the accuracy. For instance, the support vector machine (SVM; Vapnik, 1999) determines the decision function by maximizing the geometric margin, which effectively aligns with maximizing accuracy (Lin, 2002). However, in imbalanced settings where one class is substantially underrepresented, accuracy can be a misleading measure of performance. Even a trivial classifier that always predicts the majority class can achieve high accuracy while completely failing to detect samples from the minority class. As an alternative, the receiver operating characteristic (ROC) curve is widely used to evaluate classifier performance under class imbalance. The ROC curve plots the true positive rate (TPR) against the false positive rate (FPR) as the decision threshold varies, and the area under the ROC curve (AUC) serves as a popular scalar summary. A classifier with a larger AUC is generally regarded as having better classification performance.
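The contrast between accuracy and AUC described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and does not come from the paper: the 95/5 class split, the two score vectors, and the helper names `accuracy` and `auc`. The AUC is computed with the standard pairwise (Mann-Whitney) formulation, that is, the fraction of positive/negative pairs in which the positive sample receives the higher score, with ties counted as one half.

```python
def accuracy(y_true, y_pred):
    """Fraction of labels predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def auc(y_true, scores):
    """Pairwise (Mann-Whitney) AUC: share of positive/negative pairs
    where the positive sample scores higher; ties count as 0.5."""
    pos = [s for s, y in zip(scores, y_true) if y == 1]
    neg = [s for s, y in zip(scores, y_true) if y == -1]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sign_predict(scores):
    """Y_hat = sign{f(X)}, mapping non-positive scores to -1."""
    return [1 if s > 0 else -1 for s in scores]


# Hypothetical imbalanced sample: 95 negatives, 5 positives.
y_true = [-1] * 95 + [1] * 5

# Trivial decision function: constant f(X) = -1 (always the majority class).
trivial_scores = [-1.0] * 100

# Hypothetical scorer that ranks every positive above every negative,
# although all scores are still below the threshold 0.
ranking_scores = [-2.0] * 95 + [-0.5] * 5

acc_trivial = accuracy(y_true, sign_predict(trivial_scores))
auc_trivial = auc(y_true, trivial_scores)
acc_ranking = accuracy(y_true, sign_predict(ranking_scores))
auc_ranking = auc(y_true, ranking_scores)

print(acc_trivial, auc_trivial)
print(acc_ranking, auc_ranking)
```

In this toy sample both decision functions predict the majority class everywhere and therefore have the same accuracy, yet their AUC values differ: the constant function cannot rank any pair, while the second separates the classes perfectly by score. The quadratic pairwise loop is used only for clarity and would not scale to large samples.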



