Support Vector Machines
Four-legged Walking Gait Control Using a Neuromorphic Chip Interfaced to a Support Vector Learning Algorithm
Still, Susanne, Schรถlkopf, Bernhard, Hepp, Klaus, Douglas, Rodney J.
To control the walking gaits of a four-legged robot we present a novel neuromorphic VLSI chip that coordinates the relative phasing of the robot's legs similar to how spinal Central Pattern Generators are believed to control vertebrate locomotion [3]. The chip controls the leg movements by driving motors with time varying voltages which are the outputs of a small network of coupled oscillators. The characteristics of the chip's output voltages depend on a set of input parameters. The relationship between input parameters and output voltages can be computed analytically for an idealized system. In practice, however, this ideal relationship is only approximately true due to transistor mismatch and offsets. Fine tuning of the chip's input parameters is done automatically by the robotic system, using an unsupervised Support Vector (SV) learning algorithm introduced recently [7]. The learning requires only that the description of the desired output is given. The machine learns from (unlabeled) examples how to set the parameters to the chip in order to obtain a desired motor behavior.
Regularized Winnow Methods
In theory, the Winnow multiplicative update has certain advantages over the Perceptron additive update when there are many irrelevant attributes. Recently, there has been much effort on enhancing the Perceptron algorithm by using regularization, leading to a class of linear classification methods called support vector machines. Similarly, it is also possible to apply the regularization idea to the Winnow algorithm, which gives methods we call regularized Winnows. We show that the resulting methods compare with the basic Winnows in a similar way that a support vector machine compares with the Perceptron. We investigate algorithmic issues and learning properties of the derived methods. Some experimental results will also be provided to illustrate different methods. 1 Introduction In this paper, we consider the binary classification problem that is to determine a label y E {-1, 1} associated with an input vector x. A useful method for solving this problem is through linear discriminant functions, which consist of linear combinations of the components of the input variable.
Feature Selection for SVMs
Weston, Jason, Mukherjee, Sayan, Chapelle, Olivier, Pontil, Massimiliano, Poggio, Tomaso, Vapnik, Vladimir
We introduce a method of feature selection for Support Vector Machines. The method is based upon finding those features which minimize bounds on the leave-one-out error. This search can be efficiently performed via gradient descent. The resulting algorithms are shown to be superior to some standard feature selection algorithms on both toy data and real-life problems of face recognition, pedestrian detection and analyzing DNA micro array data.
Mixtures of Gaussian Processes
We introduce the mixture of Gaussian processes (MGP) model which is useful for applications in which the optimal bandwidth of a map is input dependent. The MGP is derived from the mixture of experts model and can also be used for modeling general conditional probability densities. We discuss how Gaussian processes -in particular in form of Gaussian process classification, the support vector machine and the MGP modelcan be used for quantifying the dependencies in graphical models. 1 Introduction Gaussian processes are typically used for regression where it is assumed that the underlying function is generated by one infinite-dimensional Gaussian distribution (i.e.
A Mathematical Programming Approach to the Kernel Fisher Algorithm
Mika, Sebastian, Rรคtsch, Gunnar, Mรผller, Klaus-Robert
We investigate a new kernel-based classifier: the Kernel Fisher Discriminant (KFD). A mathematical programming formulation based on the observation that KFD maximizes the average margin permits an interesting modification of the original KFD algorithm yielding the sparse KFD. We find that both, KFD and the proposed sparse KFD, can be understood in an unifying probabilistic context. Furthermore, we show connections to Support Vector Machines and Relevance Vector Machines. From this understanding, we are able to outline an interesting kernel-regression technique based upon the KFD algorithm.
Active Support Vector Machine Classification
Mangasarian, Olvi L., Musicant, David R.
Classification is achieved by a linear or nonlinear separating surface in the input space of the dataset. In this work we propose a very fast simple algorithm, based on an active set strategy for solving quadratic programs with bounds [18]. The algorithm is capable of accurately solving problems with millions of points and requires nothing more complicated than a commonly available linear equation solver [17, 1, 6] for a typically small (100) dimensional input space of the problem. Key to our approach are the following two changes to the standard linear SVM: 1. Maximize the margin (distance) between the parallel separating planes with respect to both orientation (w) as well as location relative to the origin b).
Vicinal Risk Minimization
Chapelle, Olivier, Weston, Jason, Bottou, Lรฉon, Vapnik, Vladimir
The Vicinal Risk Minimization principle establishes a bridge between generative models and methods derived from the Structural Risk Minimization Principle such as Support Vector Machines or Statistical Regularization. We explain how VRM provides a framework which integrates a number of existing algorithms, such as Parzen windows, Support Vector Machines, Ridge Regression, Constrained Logistic Classifiers and Tangent-Prop. We then show how the approach implies new algorithms for solving problems usually associated with generative models. New algorithms are described for dealing with pattern recognition problems with very different pattern distributions and dealing with unlabeled data. Preliminary empirical results are presented.
Incremental and Decremental Support Vector Machine Learning
Cauwenberghs, Gert, Poggio, Tomaso
An online recursive algorithm for training support vector machines, one vector at a time, is presented. Adiabatic increments retain the Kuhn Tucker conditions on all previously seen training data, in a number of steps each computed analytically. The incremental procedure is reversible, and decremental "unlearning" offers an efficient method to exactly evaluate leave-one-out generalization performance.
A Linear Programming Approach to Novelty Detection
Campbell, Colin, Bennett, Kristin P.
Novelty detection involves modeling the normal behaviour of a system hence enabling detection of any divergence from normality. It has potential applications in many areas such as detection of machine damage or highlighting abnormal features in medical data. One approach is to build a hypothesis estimating the support of the normal data i.e. constructing a function which is positive in the region where the data is located and negative elsewhere. Recently kernel methods have been proposed for estimating the support of a distribution and they have performed well in practice - training involves solution of a quadratic programming problem. In this paper we propose a simpler kernel method for estimating the support based on linear programming. The method is easy to implement and can learn large datasets rapidly. We demonstrate the method on medical and fault detection datasets.
A Support Vector Method for Clustering
Ben-Hur, Asa, Horn, David, Siegelmann, Hava T., Vapnik, Vladimir
We present a novel method for clustering using the support vector machine approach. Data points are mapped to a high dimensional feature space, where support vectors are used to define a sphere enclosing them. The boundary of the sphere forms in data space a set of closed contours containing the data. Data points enclosed by each contour are defined as a cluster. As the width parameter of the Gaussian kernel is decreased, these contours fit the data more tightly and splitting of contours occurs.