Support Vector Machines
Dynamically Adapting Kernels in Support Vector Machines
Cristianini, Nello, Campbell, Colin, Shawe-Taylor, John
The kernel parameter is one of the few tunable parameters in Support Vector Machines, controlling the complexity of the resulting hypothesis. Its choice amounts to model selection, and its value is usually found by means of a validation set. We present an algorithm which can automatically perform model selection with little additional computational cost and with no need of a validation set. In this procedure model selection and learning are not separate; instead, kernels are dynamically adjusted during the learning process to find the kernel parameter which provides the best possible upper bound on the generalisation error. Theoretical results motivating the approach and experimental results confirming its validity are presented.
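To make the role of the kernel parameter concrete, here is a minimal sketch that assumes a Gaussian (RBF) kernel with width `sigma`; the abstract does not name a specific kernel, and the sample points are invented for illustration. It only shows how the width changes the Gram matrix, not the paper's bound-minimizing procedure.

```python
import math

def rbf_kernel(x, y, sigma):
    """Gaussian kernel exp(-||x - y||^2 / (2 sigma^2)); sigma is the tunable width."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def gram_matrix(points, sigma):
    """Matrix of kernel values between every pair of points."""
    return [[rbf_kernel(p, q, sigma) for q in points] for p in points]

# Hypothetical sample points, chosen only for illustration.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]

# Narrow width: distinct points look almost orthogonal (a very flexible hypothesis).
narrow = gram_matrix(points, sigma=0.1)

# Wide width: all points look nearly identical (a very smooth hypothesis).
wide = gram_matrix(points, sigma=10.0)
```

With the narrow width the off-diagonal entries are close to 0, and with the wide width they are close to 1, which is the sense in which the parameter controls hypothesis complexity.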
Regularizing AdaBoost
Rätsch, Gunnar, Onoda, Takashi, Müller, Klaus R.
We will also introduce a regularization strategy (analogous to weight decay) into boosting. This strategy uses slack variables to achieve a soft margin (section 4). Numerical experiments in section 5 show the validity of our regularization approach, and finally a brief conclusion is given. 2 AdaBoost Algorithm. Let {h_t(x): t = 1, ..., T} be an ensemble of T hypotheses defined on input vector x and e
Using Analytic QP and Sparseness to Speed Training of Support Vector Machines
SVMs have empirically been shown to give good generalization performance on a wide variety of problems. However, the use of SVMs is still limited to a small group of researchers. One possible reason is that training algorithms for SVMs are slow, especially for large problems. Another explanation is that SVM training algorithms are complex, subtle, and sometimes difficult to implement. This paper describes a new SVM learning algorithm that is easy to implement, often faster, and has better scaling properties than the standard SVM training algorithm. The new SVM learning algorithm is called Sequential Minimal Optimization (or SMO).
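The abstract does not spell out the update itself. As a rough sketch, the commonly published form of the SMO step optimizes two Lagrange multipliers at a time in closed form, keeping the box constraint 0 <= a <= C and the equality constraint on the labels satisfied. The function name, argument names, and the toy values below are assumptions made for illustration.

```python
def smo_pair_step(a_i, a_j, y_i, y_j, e_i, e_j, k_ii, k_jj, k_ij, c):
    """Analytically optimize two multipliers; e_i, e_j are errors f(x) - y."""
    # Feasible segment for a_j implied by the box [0, c] and by keeping
    # y_i * a_i + y_j * a_j constant.
    if y_i != y_j:
        low, high = max(0.0, a_j - a_i), min(c, c + a_j - a_i)
    else:
        low, high = max(0.0, a_i + a_j - c), min(c, a_i + a_j)

    # Curvature of the objective along the constraint line.
    eta = k_ii + k_jj - 2.0 * k_ij
    if low >= high or eta <= 0.0:
        # Simplification for this sketch: skip degenerate pairs.
        return a_i, a_j

    # Unconstrained optimum for a_j, then clip to the feasible segment.
    a_j_new = a_j + y_j * (e_i - e_j) / eta
    a_j_new = min(high, max(low, a_j_new))

    # Move a_i by the matching amount so the equality constraint still holds.
    a_i_new = a_i + y_i * y_j * (a_j - a_j_new)
    return a_i_new, a_j_new

# Toy case: two orthogonal unit-norm points with opposite labels, starting from zero.
new_i, new_j = smo_pair_step(0.0, 0.0, 1, -1, -1.0, 1.0, 1.0, 1.0, 0.0, c=1.0)

# Same case with a tighter box, so the step is clipped at c.
clip_i, clip_j = smo_pair_step(0.0, 0.0, 1, -1, -1.0, 1.0, 1.0, 1.0, 0.0, c=0.5)
```

Because each step has a closed-form solution, no numerical QP solver is needed in the inner loop. A full trainer would also need a rule for choosing pairs and for updating the threshold, which this sketch leaves out.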
Support Vector Machines Applied to Face Recognition
Phillips, P. J.
On the other hand, in face recognition, there are many individuals (classes), and only a few images (samples) per person, and algorithms must recognize faces by extrapolating from the training samples. In numerous applications there can be only one training sample (image) of each person. Support vector machines (SVMs) are formulated to solve a classical two-class pattern recognition problem. We adapt SVM to face recognition by modifying the interpretation of the output of an SVM classifier and devising a representation of facial images that is concordant with a two-class problem. A traditional SVM returns a binary value, the class of the object.
Exploiting Generative Models in Discriminative Classifiers
Jaakkola, Tommi, Haussler, David
On the other hand, discriminative methods such as support vector machines enable us to construct flexible decision boundaries and often result in classification performance superior to that of the model-based approaches. An ideal classifier should combine these two complementary approaches. In this paper, we develop a natural way of achieving this combination by deriving kernel functions for use in discriminative methods such as support vector machines from generative probability models.
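One well-known construction of this kind is the Fisher kernel, which compares two examples through the gradient of the generative model's log-likelihood. The abstract does not name it, so treat the following as an illustrative sketch under a deliberately simple assumption: a one-dimensional Gaussian model whose only free parameter is the mean `mu`. All function names are hypothetical.

```python
def score(x, mu, sigma):
    """Gradient of log N(x | mu, sigma^2) with respect to mu."""
    return (x - mu) / sigma ** 2

def fisher_information(sigma):
    """Expected squared score for the mean of a Gaussian: 1 / sigma^2."""
    return 1.0 / sigma ** 2

def fisher_kernel(x, y, mu, sigma):
    """k(x, y) = score(x) * I^{-1} * score(y) for the one-parameter model."""
    return score(x, mu, sigma) * score(y, mu, sigma) / fisher_information(sigma)

# Illustrative values: model mean 1, standard deviation 2.
value = fisher_kernel(3.0, 5.0, mu=1.0, sigma=2.0)
```

For this toy model the kernel reduces to (x - mu)(y - mu) / sigma^2, so examples on the same side of the model mean get a positive similarity. Richer generative models give correspondingly richer kernels.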
Semi-Supervised Support Vector Machines
Bennett, Kristin P., Demiriz, Ayhan
We introduce a semi-supervised support vector machine (S3VM) method. Given a training set of labeled data and a working set of unlabeled data, S3VM constructs a support vector machine using both the training and working sets. We use S3VM to solve the transduction problem using overall risk minimization (ORM) posed by Vapnik. The transduction problem is to estimate the value of a classification function at the given points in the working set. This contrasts with the standard inductive learning problem of estimating the classification function at all possible values and then using the fixed function to deduce the classes of the working set data.
Classification by Pairwise Coupling
Hastie, Trevor, Tibshirani, Robert
We discuss a strategy for polychotomous classification that involves estimating class probabilities for each pair of classes, and then coupling the estimates together. The coupling model is similar to the Bradley-Terry method for paired comparisons. We study the nature of the class probability estimates that arise, and examine the performance of the procedure in simulated datasets. The classifiers used include linear discriminants and nearest neighbors: application to support vector machines is also briefly described.
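The coupling step can be sketched as a Bradley-Terry style fixed-point iteration: given pairwise estimates r[i][j] of the probability of class i against class j, find class probabilities p such that p[i] / (p[i] + p[j]) matches them as closely as possible. This is a minimal sketch assuming equal weight for every pair; the function name, iteration limits, and toy data are illustrative rather than taken from the paper.

```python
def couple_pairwise(r, sweeps=1000, tol=1e-12):
    """Combine pairwise probabilities r[i][j] ~ P(class i | class i or j)."""
    k = len(r)
    p = [1.0 / k] * k  # start from uniform class probabilities
    for _ in range(sweeps):
        previous = list(p)
        for i in range(k):
            others = [j for j in range(k) if j != i]
            # Observed pairwise support for class i versus what p currently implies.
            observed = sum(r[i][j] for j in others)
            implied = sum(p[i] / (p[i] + p[j]) for j in others)
            p[i] *= observed / implied
            # Renormalize so the estimates stay a probability vector.
            total = sum(p)
            p = [value / total for value in p]
        if max(abs(a - b) for a, b in zip(p, previous)) < tol:
            break
    return p

# Toy check: pairwise estimates generated from known class probabilities.
true_p = [0.5, 0.3, 0.2]
r = [[true_p[i] / (true_p[i] + true_p[j]) for j in range(3)] for i in range(3)]
estimate = couple_pairwise(r)
```

When the pairwise estimates are mutually consistent, as in the toy data, the iteration recovers the generating probabilities. With noisy estimates from real pairwise classifiers it returns a compromise instead.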
From Regularization Operators to Support Vector Kernels
Smola, Alex J., Schölkopf, Bernhard
Support Vector (SV) Machines for pattern recognition, regression estimation and operator inversion exploit the idea of transforming into a high dimensional feature space where they perform a linear algorithm. Instead of evaluating this map explicitly, one uses Hilbert-Schmidt kernels k(x, y) which correspond to dot products of the mapped data in high dimensional space, i.e. k(x, y) = (Φ(x) · Φ(y))
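A small numerical check can illustrate the identity k(x, y) = (Φ(x) · Φ(y)). The sketch below assumes a homogeneous polynomial kernel of degree 2 on two-dimensional inputs, chosen only because its feature map can be written out by hand; the abstract does not single out this kernel.

```python
import math

def dot(u, v):
    """Ordinary dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def phi(x):
    """Explicit feature map for the degree-2 kernel on 2-D inputs."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)

def poly2_kernel(x, y):
    """k(x, y) = (x . y)^2, evaluated without ever building phi."""
    return dot(x, y) ** 2

# Illustrative inputs.
x, y = (1.0, 2.0), (3.0, -1.0)
explicit = dot(phi(x), phi(y))   # dot product in the mapped space
implicit = poly2_kernel(x, y)    # same value from the kernel alone
```

Both routes give the same number up to floating-point rounding. The kernel route never constructs the mapped vectors, which is what makes very high dimensional feature spaces affordable.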