Support Vector Machines
Sex with Support Vector Machines
Moghaddam, Baback, Yang, Ming-Hsuan
Ming-Hsuan Yang, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA, mhyang avision.ai.uiuc.edu. Abstract: Nonlinear Support Vector Machines (SVMs) are investigated for visual sex classification with low-resolution "thumbnail" faces (21-by-12 pixels) processed from 1,755 images from the FERET face database. The performance of SVMs is shown to be superior to traditional pattern classifiers (Linear, Quadratic, Fisher Linear Discriminant, Nearest-Neighbor) as well as more modern techniques such as Radial Basis Function (RBF) classifiers and large ensemble RBF networks. Furthermore, the SVM performance (3.4% error) is currently the best result reported in the open literature. 1 Introduction: In recent years, SVMs have been successfully applied to various tasks in computational face processing. These include face detection [14], face pose discrimination [12] and face recognition [16]. Although facial sex classification has attracted much attention in the psychological literature [1, 4, 8, 15], relatively few computational learning methods have been proposed.
Four-legged Walking Gait Control Using a Neuromorphic Chip Interfaced to a Support Vector Learning Algorithm
Still, Susanne, Schölkopf, Bernhard, Hepp, Klaus, Douglas, Rodney J.
To control the walking gaits of a four-legged robot we present a novel neuromorphic VLSI chip that coordinates the relative phasing of the robot's legs, similar to how spinal Central Pattern Generators are believed to control vertebrate locomotion [3]. The chip controls the leg movements by driving motors with time-varying voltages which are the outputs of a small network of coupled oscillators. The characteristics of the chip's output voltages depend on a set of input parameters. The relationship between input parameters and output voltages can be computed analytically for an idealized system.
Feature Selection for SVMs
Weston, Jason, Mukherjee, Sayan, Chapelle, Olivier, Pontil, Massimiliano, Poggio, Tomaso, Vapnik, Vladimir
We introduce a method of feature selection for Support Vector Machines. The method is based upon finding those features which minimize bounds on the leave-one-out error. This search can be efficiently performed via gradient descent. The resulting algorithms are shown to be superior to some standard feature selection algorithms on both toy data and real-life problems of face recognition, pedestrian detection and analyzing DNA microarray data. 1 Introduction: In many supervised learning problems feature selection is important for a variety of reasons: generalization performance, running time requirements, and constraints and interpretational issues imposed by the problem itself.
Mixtures of Gaussian Processes
We introduce the mixture of Gaussian processes (MGP) model, which is useful for applications in which the optimal bandwidth of a map is input dependent. The MGP is derived from the mixture of experts model and can also be used for modeling general conditional probability densities. We discuss how Gaussian processes - in particular in the form of Gaussian process classification, the support vector machine and the MGP model - can be used for quantifying the dependencies in graphical models. 1 Introduction: Gaussian processes are typically used for regression where it is assumed that the underlying function is generated by one infinite-dimensional Gaussian distribution (i.e.
A Mathematical Programming Approach to the Kernel Fisher Algorithm
Mika, Sebastian, Rätsch, Gunnar, Müller, Klaus-Robert
We investigate a new kernel-based classifier: the Kernel Fisher Discriminant (KFD). A mathematical programming formulation based on the observation that KFD maximizes the average margin permits an interesting modification of the original KFD algorithm, yielding the sparse KFD. We find that both KFD and the proposed sparse KFD can be understood in a unifying probabilistic context. Furthermore, we show connections to Support Vector Machines and Relevance Vector Machines. From this understanding, we are able to outline an interesting kernel-regression technique based upon the KFD algorithm.
Active Support Vector Machine Classification
Mangasarian, Olvi L., Musicant, David R.
Classification is achieved by a linear or nonlinear separating surface in the input space of the dataset. In this work we propose a very fast, simple algorithm, based on an active set strategy for solving quadratic programs with bounds [18]. The algorithm is capable of accurately solving problems with millions of points and requires nothing more complicated than a commonly available linear equation solver [17, 1, 6] for a typically small (100) dimensional input space of the problem. Key to our approach are the following two changes to the standard linear SVM: 1. Maximize the margin (distance) between the parallel separating planes with respect to both orientation (w) as well as location relative to the origin (b).
Large Scale Bayes Point Machines
Herbrich, Ralf, Graepel, Thore
The concept of averaging over classifiers is fundamental to the Bayesian analysis of learning. Based on this viewpoint, it has recently been demonstrated for linear classifiers that the centre of mass of version space (the set of all classifiers consistent with the training set) - also known as the Bayes point - exhibits excellent generalisation abilities. In this paper we present a method based on the simple perceptron learning algorithm which allows one to overcome this algorithmic drawback. The method is algorithmically simple and is easily extended to the multi-class case. We present experimental results on the MNIST data set of handwritten digits which show that Bayes point machines (BPMs) are competitive with the current world champion, the support vector machine.
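The perceptron-based idea in the abstract above can be illustrated with a minimal sketch. It assumes, beyond what the abstract states, that the centre of mass of version space is approximated by averaging normalised perceptron solutions obtained from different random orderings of the training set. The function names, the toy data and the absence of a bias term are illustrative choices, not taken from the paper.

```python
import math
import random

def perceptron(data, order, max_epochs=100):
    """Train a bias-free perceptron, visiting examples in the given order."""
    w = [0.0] * len(data[0][0])
    for _ in range(max_epochs):
        mistakes = 0
        for i in order:
            x, y = data[i]
            # Update on every misclassified (or zero-margin) example.
            if y * sum(wj * xj for wj, xj in zip(w, x)) <= 0:
                w = [wj + y * xj for wj, xj in zip(w, x)]
                mistakes += 1
        if mistakes == 0:  # consistent with the training set: w is in version space
            break
    return w

def normalise(w):
    """Scale w to unit Euclidean length."""
    norm = math.sqrt(sum(wj * wj for wj in w))
    return [wj / norm for wj in w]

def approx_bayes_point(data, n_runs=50, seed=0):
    """Average unit-length perceptron solutions over random orderings (assumed scheme)."""
    rng = random.Random(seed)
    total = [0.0] * len(data[0][0])
    for _ in range(n_runs):
        order = list(range(len(data)))
        rng.shuffle(order)
        w = normalise(perceptron(data, order))
        total = [t + wj for t, wj in zip(total, w)]
    return normalise(total)

def predict(w, x):
    """Classify x as +1 or -1 with the linear classifier w."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) > 0 else -1

# Hypothetical toy data, linearly separable through the origin.
toy = [((2.0, 1.0), 1), ((1.0, 3.0), 1), ((2.0, -0.5), 1),
       ((-1.0, -2.0), -1), ((-3.0, -1.0), -1), ((-2.0, 0.5), -1)]
w_bp = approx_bayes_point(toy, n_runs=20, seed=1)
```

Because version space is a convex cone for linear classifiers, the average of consistent weight vectors is itself consistent with the training set, which is why averaging is a plausible stand-in for the centre of mass in this sketch.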
Vicinal Risk Minimization
Chapelle, Olivier, Weston, Jason, Bottou, Léon, Vapnik, Vladimir
The Vicinal Risk Minimization principle establishes a bridge between generative models and methods derived from the Structural Risk Minimization Principle such as Support Vector Machines or Statistical Regularization. We explain how VRM provides a framework which integrates a number of existing algorithms, such as Parzen windows, Support Vector Machines, Ridge Regression, Constrained Logistic Classifiers and Tangent-Prop. We then show how the approach implies new algorithms for solving problems usually associated with generative models. New algorithms are described for dealing with pattern recognition problems with very different pattern distributions and dealing with unlabeled data. Preliminary empirical results are presented.
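As a point of reference for one of the existing algorithms named in the abstract above, a Parzen window classifier can be sketched as follows. This is a generic textbook version with a Gaussian kernel, not the paper's VRM derivation; the function names, the bandwidth `sigma` and the toy data are illustrative assumptions.

```python
import math

def gaussian_kernel(x, z, sigma):
    """Gaussian window centred at z, evaluated at x."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def parzen_classify(train, x, sigma=1.0):
    """Label x by the sign of the kernel-weighted sum of training labels (+1/-1)."""
    score = sum(y * gaussian_kernel(x, xi, sigma) for xi, y in train)
    return 1 if score >= 0 else -1

# Hypothetical toy data: two well-separated clusters.
train = [((0.0, 0.0), 1), ((0.0, 1.0), 1), ((5.0, 5.0), -1), ((5.0, 6.0), -1)]
near_positive = parzen_classify(train, (0.2, 0.3))
near_negative = parzen_classify(train, (5.1, 5.4))
```

Each training point contributes a window around itself, so the decision depends on the local density of each class near the query point.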
Incremental and Decremental Support Vector Machine Learning
Cauwenberghs, Gert, Poggio, Tomaso
An online recursive algorithm for training support vector machines, one vector at a time, is presented. Adiabatic increments retain the Kuhn-Tucker conditions on all previously seen training data, in a number of steps each computed analytically. The incremental procedure is reversible, and decremental "unlearning" offers an efficient method to exactly evaluate leave-one-out generalization performance.
A Linear Programming Approach to Novelty Detection
Campbell, Colin, Bennett, Kristin P.
Novelty detection involves modeling the normal behaviour of a system, hence enabling detection of any divergence from normality. It has potential applications in many areas such as detection of machine damage or highlighting abnormal features in medical data. One approach is to build a hypothesis estimating the support of the normal data, i.e. constructing a function which is positive in the region where the data is located and negative elsewhere. Recently kernel methods have been proposed for estimating the support of a distribution and they have performed well in practice - training involves solution of a quadratic programming problem. In this paper we propose a simpler kernel method for estimating the support based on linear programming. The method is easy to implement and can learn large datasets rapidly. We demonstrate the method on medical and fault detection datasets.