Multiclass classification by sparse multinomial logistic regression
Abramovich, Felix, Grinshtein, Vadim, Levy, Tomer
Classification is one of the core problems in statistical learning and has been intensively studied in the statistical and machine learning literature. Nevertheless, while the theory for binary classification is well developed (see Devroye, Györfi and Lugosi, 1996; Vapnik, 2000; Boucheron, Bousquet and Lugosi, 2005, and the references therein for comprehensive reviews), its multiclass extensions are much less complete. Consider a general $L$-class classification problem with a (high-dimensional) vector of features $X \in \mathcal{X} \subseteq \mathbb{R}^d$ and an outcome class label $Y \in \{1, \ldots, L\}$. We can model it as $Y \mid (X = x) \sim \mathrm{Mult}(p_1(x), \ldots, p_L(x))$, where $p_l(x) = P(Y = l \mid X = x)$, $l = 1, \ldots, L$. A classifier is a measurable function $\eta: \mathcal{X} \to \{1, \ldots, L\}$, and its accuracy is measured by the misclassification error $R(\eta) = P(Y \neq \eta(X))$. The optimal classifier that minimizes this error is the Bayes classifier $\eta^*(x) = \arg\max_{1 \le l \le L} p_l(x)$, with $R(\eta^*) = 1 - E_X \max_{1 \le l \le L} p_l(X)$. The probabilities $p_l(x)$ are, however, unknown, and one should derive a classifier $\widehat{\eta}(x)$ from the available data $D$: a random sample of $n$ independent observations $(X_1, Y_1), \ldots, (X_n, Y_n)$ from the joint distribution of $(X, Y)$. The corresponding (conditional) misclassification error of $\widehat{\eta}$ is $R(\widehat{\eta}) = P(Y \neq \widehat{\eta}(X) \mid D)$, and the goodness of $\widehat{\eta}$ w.r.t.
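The setup above can be illustrated with a minimal simulation sketch. It is not the authors' estimator, just an assumed 3-class example with a sparse multinomial logit model: we compute the Bayes classifier $\eta^*$ and its error $1 - E_X \max_l p_l(X)$ on simulated data, then fit a plug-in classifier via $\ell_1$-penalized multinomial logistic regression using scikit-learn. All dimensions and names here are illustrative.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Hypothetical sparse setting: L = 3 classes, d = 20 features, with only
# the first 2 features active in the multinomial logit coefficients.
n, d, L = 500, 20, 3
beta = np.zeros((d, L))
beta[:2, :] = rng.normal(size=(2, L))

# Generate features and class probabilities p_l(x) ∝ exp(x @ beta_l).
X = rng.normal(size=(n, d))
scores = X @ beta
p = np.exp(scores - scores.max(axis=1, keepdims=True))
p /= p.sum(axis=1, keepdims=True)
Y = np.array([rng.choice(L, p=row) for row in p])

# Bayes classifier eta*(x) = argmax_l p_l(x); its (here exactly known)
# error is 1 - E_X max_l p_l(X), the best achievable.
bayes_pred = p.argmax(axis=1)
bayes_err = 1 - p.max(axis=1).mean()

# Plug-in classifier derived from the data D = (X, Y): an l1-penalized
# (hence sparse) multinomial logistic regression fit.
clf = LogisticRegression(penalty="l1", solver="saga", C=1.0, max_iter=5000)
clf.fit(X, Y)
err_hat = (clf.predict(X) != Y).mean()
```

The $\ell_1$ penalty drives most fitted coefficients toward zero, so in this sketch the estimate recovers a sparse coefficient matrix in the spirit of the sparse multinomial logistic regression studied in the paper.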
Mar-4-2020