Classification vs regression in overparameterized regimes: Does the loss function matter?

Muthukumar, Vidya, Narang, Adhyyan, Subramanian, Vignesh, Belkin, Mikhail, Hsu, Daniel, Sahai, Anant

arXiv.org Machine Learning 

Paradigmatic problems in supervised machine learning (ML) involve predicting an output response from an input, based on patterns extracted from a (training) dataset. In classification, the output response is (finitely) discrete and the goal is to assign input data to one of these discrete categories. In regression, the output is continuous, typically a real number or a vector. Owing to this important distinction in output response, the two tasks are typically treated differently. The differences in treatment manifest in two phases of modern ML: optimization (training), which consists of an algorithmic procedure to extract a predictor from the training data, typically by minimizing the training loss (also called empirical risk); and generalization (testing), which consists of evaluating the obtained predictor on a separate test, or validation, dataset. Traditionally, the choice of loss functions for the two phases is starkly different across classification and regression tasks. The squared-loss function is typically used for both the training and testing phases in regression. In contrast, the hinge or logistic (cross-entropy for multi-class problems) loss functions are typically used in the training phase of classification, while the very different 0-1 loss function is used for testing.
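
To make the contrast concrete, the following is a minimal NumPy sketch (not taken from the paper) of the loss functions named above, assuming binary labels in {-1, +1} for the classification losses; the function names are illustrative only.

    import numpy as np

    def squared_loss(y, y_hat):
        # Regression: used for both training and testing.
        return (y - y_hat) ** 2

    def hinge_loss(y, score):
        # Classification training loss (e.g., SVMs); y in {-1, +1}.
        return np.maximum(0.0, 1.0 - y * score)

    def logistic_loss(y, score):
        # Classification training loss in margin form (binary cross-entropy); y in {-1, +1}.
        return np.log1p(np.exp(-y * score))

    def zero_one_loss(y, score):
        # Classification test loss: 1 if the predicted sign is wrong, else 0.
        return (np.sign(score) != y).astype(float)

    # The same predictions incur very different training vs testing losses.
    y = np.array([1.0, -1.0, 1.0])
    scores = np.array([-0.3, -2.0, 0.8])
    print(hinge_loss(y, scores))     # [1.3, 0.0, 0.2]
    print(logistic_loss(y, scores))  # smooth surrogate values
    print(zero_one_loss(y, scores))  # [1.0, 0.0, 0.0]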
