Support Vector Machines
Online Classification on a Budget
Crammer, Koby, Kandola, Jaz, Singer, Yoram
Online algorithms for classification often require vast amounts of memory and computation time when employed in conjunction with kernel functions. In this paper we describe and analyze a simple approach for an on-the-fly reduction of the number of past examples used for prediction. Experiments performed with real datasets show that using the proposed algorithmic approach with a single epoch is competitive with the support vector machine (SVM) although the latter, being a batch algorithm, accesses each training example multiple times.
Margin Maximizing Loss Functions
Rosset, Saharon, Zhu, Ji, Hastie, Trevor J.
Margin maximizing properties play an important role in the analysis of classiยฃcation models,such as boosting and support vector machines. Margin maximization is theoretically interesting because it facilitates generalization error analysis, and practically interesting because it presents a clear geometric interpretation of the models being built. We formulate and prove a sufยฃcient condition for the solutions of regularized loss functions to converge to margin maximizing separators, asthe regularization vanishes. This condition covers the hinge loss of SVM, the exponential loss of AdaBoost and logistic regression loss. We also generalize it to multi-class classiยฃcation problems, and present margin maximizing multiclass versionsof logistic regression and support vector machines.
Towards Social Robots: Automatic Evaluation of Human-Robot Interaction by Facial Expression Classification
Littlewort, G.C., Bartlett, M.S., Fasel, I.R., Chenu, J., Kanda, T., Ishiguro, H., Movellan, J.R.
Computer animated agents and robots bring a social dimension to human computer interaction and force us to think in new ways about how computers could be used in daily life. Face to face communication is a real-time process operating at a time scale of less than a second. In this paper we present progress on a perceptual primitive to automatically detect frontal faces in the video stream and code them with respect to 7 dimensions in real time: neutral, anger, disgust, fear, joy, sadness, surprise. The face finder employs a cascade of feature detectors trained with boosting techniques [13, 2]. The expression recognizer employs a novel combination of Adaboost and SVM's. The generalization performance to new subjects for a 7-way forced choice was 93.3% and 97% correct on two publicly available datasets. The outputs of the classifier change smoothly as a function of time, providing a potentially valuable representation to code facial expression dynamics in a fully automatic and unobtrusive manner. The system was deployed and evaluated for measuring spontaneous facial expressions in the field in an application for automatic assessment of human-robot interaction.
Phonetic Speaker Recognition with Support Vector Machines
Campbell, William M., Campbell, Joseph P., Reynolds, Douglas A., Jones, Douglas A., Leek, Timothy R.
A recent area of significant progress in speaker recognition is the use of high level features--idiolect, phonetic relations, prosody, discourse structure, etc. A speaker not only has a distinctive acoustic sound but uses language in a characteristic manner. Large corpora of speech data available in recent years allow experimentation with long term statistics of phone patterns, word patterns, etc. of an individual. We propose the use of support vector machines and term frequency analysis of phone sequences to model a given speaker. To this end, we explore techniques for text categorization applied to the problem. We derive a new kernel based upon a linearization of likelihood ratio scoring. We introduce a new phone-based SVM speaker recognition approach that halves the error rate of conventional phone-based approaches.
Margin Maximizing Loss Functions
Rosset, Saharon, Zhu, Ji, Hastie, Trevor J.
Margin maximizing properties play an important role in the analysis of classiยฃcation models, such as boosting and support vector machines. Margin maximization is theoretically interesting because it facilitates generalization error analysis, and practically interesting because it presents a clear geometric interpretation of the models being built. We formulate and prove a sufยฃcient condition for the solutions of regularized loss functions to converge to margin maximizing separators, as the regularization vanishes. This condition covers the hinge loss of SVM, the exponential loss of AdaBoost and logistic regression loss. We also generalize it to multi-class classiยฃcation problems, and present margin maximizing multiclass versions of logistic regression and support vector machines.
Approximate Analytical Bootstrap Averages for Support Vector Classifiers
Malzahn, Dรถrthe, Opper, Manfred
We compute approximate analytical bootstrap averages for support vector classification using a combination of the replica method of statistical physics and the TAP approach for approximate inference. We test our method on a few datasets and compare it with exact averages obtained by extensive Monte-Carlo sampling.
Large Margin Classifiers: Convex Loss, Low Noise, and Convergence Rates
Bartlett, Peter L., Jordan, Michael I., Mcauliffe, Jon D.
Many classification algorithms, including the support vector machine, boosting and logistic regression, can be viewed as minimum contrast methods that minimize a convex surrogate of the 0-1 loss function. We characterize the statistical consequences of using such a surrogate by providing a general quantitative relationship between the risk as assessed using the 0-1 loss and the risk as assessed using any nonnegative surrogate loss function. We show that this relationship gives nontrivial bounds under the weakest possible condition on the loss function--that it satisfy a pointwise form of Fisher consistency for classification. The relationship is based on a variational transformation of the loss function that is easy to compute in many applications. We also present a refined version of this result in the case of low noise. Finally, we present applications of our results to the estimation of convergence rates in the general setting of function classes that are scaled hulls of a finite-dimensional base class.
Sparseness of Support Vector Machines---Some Asymptotically Sharp Bounds
The decision functions constructed by support vector machines (SVM's) usually depend only on a subset of the training set--the so-called support vectors. We derive asymptotically sharp lower and upper bounds on the number of support vectors for several standard types of SVM's. In particular, we show for the Gaussian RBF kernel that the fraction of support vectors tends to twice the Bayes risk for the L1-SVM, to the probability of noise for the L2-SVM, and to 1 for the LS-SVM.
Nonlinear Filtering of Electron Micrographs by Means of Support Vector Regression
Vollgraf, Roland, Scholz, Michael, Meinertzhagen, Ian A., Obermayer, Klaus
Nonlinear filtering can solve very complex problems, but typically involve very time consuming calculations. Here we show that for filters that are constructed as a RBF network with Gaussian basis functions, a decomposition into linear filters exists, which can be computed efficiently in the frequency domain, yielding dramatic improvement in speed. We present an application of this idea to image processing. In electron micrograph images of photoreceptor terminals of the fruit fly, Drosophila, synaptic vesicles containing neurotransmitter should be detected and labeled automatically. We use hand labels, provided by human experts, to learn a RBF filter using Support Vector Regression with Gaussian kernels. We will show that the resulting nonlinear filter solves the task to a degree of accuracy, which is close to what can be achieved by human experts. This allows the very time consuming task of data evaluation to be done efficiently.