Support Vector Machines
An Auditory Paradigm for Brain-Computer Interfaces
Hill, N. J., Lal, Thomas N., Bierig, Karin, Birbaumer, Niels, Schölkopf, Bernhard
Motivated by the particular problems involved in communicating with "locked-in" paralysed patients, we aim to develop a brain-computer interface that uses auditory stimuli. We describe a paradigm that allows a user to make a binary decision by focusing attention on one of two concurrent auditory stimulus sequences. Using Support Vector Machine classification and Recursive Channel Elimination on the independent components of averaged event-related potentials, we show that an untrained user's EEG data can be classified with an encouragingly high level of accuracy. This suggests that it is possible for users to modulate EEG signals in a single trial by the conscious direction of attention, well enough to be useful in BCI.
Nonparametric Transforms of Graph Kernels for Semi-Supervised Learning
Zhu, Jerry, Kandola, Jaz, Ghahramani, Zoubin, Lafferty, John D.
We present an algorithm based on convex optimization for constructing kernels for semi-supervised learning. The kernel matrices are derived from the spectral decomposition of graph Laplacians, and combine labeled and unlabeled data in a systematic fashion. Unlike previous work using diffusion kernels and Gaussian random field kernels, a nonparametric kernel approach is presented that incorporates order constraints during optimization. This results in flexible kernels and avoids the need to choose among different parametric forms. Our approach relies on a quadratically constrained quadratic program (QCQP), and is computationally feasible for large datasets. We evaluate the kernels on real datasets using support vector machines, with encouraging results.
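The kernel family described above can be written compactly. The following is a sketch under assumed notation (the symbols $L$, $\phi_i$, $\lambda_i$ and $\mu_i$ do not appear in the abstract): the kernel is built from the eigenvectors of the graph Laplacian, and the order constraints require smoother eigenvectors to receive weights at least as large as rougher ones.

```latex
% Assumed notation: L is the graph Laplacian over the n labeled and unlabeled points,
% with eigenpairs (\lambda_i, \phi_i) sorted so that \lambda_1 \le \dots \le \lambda_n.
K = \sum_{i=1}^{n} \mu_i \, \phi_i \phi_i^{\top},
\qquad \mu_i \ge 0,
\qquad \mu_i \ge \mu_{i+1}, \quad i = 1, \dots, n-1 .
```

In this form the spectral weights $\mu_i$ are the free variables of the optimization, rather than values fixed by a parametric function of $\lambda_i$ as in diffusion or Gaussian random field kernels.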
Class-size Independent Generalization Analysis of Some Discriminative Multi-Category Classification
We consider the problem of deriving class-size independent generalization bounds for some regularized discriminative multi-category classification methods. In particular, we obtain an expected generalization bound for a standard formulation of multi-category support vector machines. Based on the theoretical result, we argue that the formulation over-penalizes misclassification error, which in theory may lead to poor generalization performance. A remedy, based on a generalization of multi-category logistic regression (conditional maximum entropy), is then proposed, and its theoretical properties are examined.
Machine Learning Applied to Perception: Decision Images for Gender Classification
Wichmann, Felix A., Graf, Arnulf B., Bülthoff, Heinrich H., Simoncelli, Eero P., Schölkopf, Bernhard
We study gender discrimination of human faces using a combination of psychophysical classification and discrimination experiments together with methods from machine learning. We reduce the dimensionality of a set of face images using principal component analysis, and then train a set of linear classifiers on this reduced representation (linear support vector machines (SVMs), relevance vector machines (RVMs), Fisher linear discriminant (FLD), and prototype (prot) classifiers) using human classification data. Because we combine a linear preprocessor with linear classifiers, the entire system acts as a linear classifier, allowing us to visualise the decision-image corresponding to the normal vector of the separating hyperplanes (SH) of each classifier. We predict that the female-to-maleness transition along the normal vector for classifiers closely mimicking human classification (SVM and RVM [1]) should be faster than the transition along any other direction. A psychophysical discrimination experiment using the decision images as stimuli is consistent with this prediction.
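The claim that a linear preprocessor followed by a linear classifier is itself a linear classifier can be checked on a toy example. The sketch below is not the authors' code: the projection matrix `P`, the mean image `mean`, and the classifier parameters `w` and `b` are made-up values for a hypothetical 4-pixel image reduced to 2 components. It maps the classifier's normal vector back to pixel space to obtain a decision image.

```python
# Toy illustration: linear projection + linear classifier = linear classifier.
# All numbers below are hypothetical, chosen only to make the algebra visible.

def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Rows of P play the role of principal components (assumed values).
P = [[0.5, 0.5, 0.5, 0.5],
     [0.5, -0.5, 0.5, -0.5]]
mean = [1.0, 2.0, 3.0, 4.0]   # assumed mean image
w = [2.0, -1.0]               # assumed classifier normal vector in the reduced space
b = 0.25                      # assumed classifier offset

def decide_reduced(x):
    """Center, project, then apply the linear classifier."""
    z = matvec(P, [xi - mi for xi, mi in zip(x, mean)])
    return dot(w, z) + b

# Decision image: the normal vector expressed in pixel space, P^T w.
decision_image = matvec(transpose(P), w)
offset = b - dot(decision_image, mean)

def decide_pixels(x):
    """Equivalent classifier acting directly on pixels."""
    return dot(decision_image, x) + offset
```

Both functions return the same decision value for any input image, which is what makes it possible to display the decision image as a stimulus.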
Fast Rates to Bayes for Kernel Machines
Steinwart, Ingo, Scovel, Clint
We establish learning rates to the Bayes risk for support vector machines (SVMs) with hinge loss. In particular, for SVMs with Gaussian RBF kernels we propose a geometric condition for distributions which can be used to determine approximation properties of these kernels. Finally, we compare our methods with a recent paper of G. Blanchard et al.
Density Level Detection is Classification
Steinwart, Ingo, Hush, Don, Scovel, Clint
We show that anomaly detection can be interpreted as a binary classification problem. Using this interpretation we propose a support vector machine (SVM) for anomaly detection. We then present some theoretical results which include consistency and learning rates. Finally, we experimentally compare our SVM with the standard one-class SVM.
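One way to picture the reduction to binary classification is as a data-construction step: observed points are labeled +1 and artificial points drawn from a reference distribution are labeled -1, after which any binary classifier can be trained. The sketch below shows only that construction. The helper name `make_classification_set`, the uniform reference distribution, and the box bounds are assumptions for illustration, not details taken from the abstract.

```python
import random

def make_classification_set(data, n_ref, low, high, seed=0):
    """Build a binary training set: observed points get label +1,
    uniform reference draws on [low, high]^d get label -1.
    (Hypothetical helper; the uniform reference is an assumption.)"""
    rng = random.Random(seed)
    dim = len(data[0])
    reference = [[rng.uniform(low, high) for _ in range(dim)]
                 for _ in range(n_ref)]
    points = [list(p) for p in data] + reference
    labels = [1] * len(data) + [-1] * n_ref
    return points, labels

# Three observed 2-D points plus five reference draws.
observed = [[0.1, 0.2], [0.0, -0.1], [0.2, 0.1]]
points, labels = make_classification_set(observed, n_ref=5, low=-1.0, high=1.0)
```

A classifier trained on `points` and `labels` then scores regions where observed data is dense relative to the reference as normal, and the remainder as anomalous.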
A Temporal Kernel-Based Model for Tracking Hand Movements from Neural Activities
Shpigelman, Lavi, Crammer, Koby, Paz, Rony, Vaadia, Eilon, Singer, Yoram
We devise and experiment with a dynamical kernel-based system for tracking hand movements from neural activity. The state of the system corresponds to the hand location, velocity, and acceleration, while the system's inputs are the instantaneous spike rates. The system's state dynamics is defined as a combination of a linear mapping from the previous estimated state and a kernel-based mapping tailored for modeling neural activities. In contrast to generative models, the activity-to-state mapping is learned using discriminative methods by minimizing a noise-robust loss function. We use this approach to predict hand trajectories on the basis of neural activity in motor cortex of behaving monkeys and find that the proposed approach is more accurate than both a static approach based on support vector regression and the Kalman filter.
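The state update described above, a linear map of the previous state plus a kernel-based map of the current spike rates, can be sketched as follows. This is an illustrative form only: the Gaussian kernel is a stand-in for the paper's tailored kernel, and the names `step`, `A`, `support_inputs`, and `coeffs` are hypothetical. Learning the coefficients with a noise-robust loss is not shown.

```python
import math

def gaussian_kernel(u, v, gamma=1.0):
    """Stand-in kernel on spike-rate vectors (an assumption, not the paper's kernel)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))

def step(prev_state, spike_rates, A, support_inputs, coeffs, gamma=1.0):
    """One update: next_state = A * prev_state + sum_j k(spike_rates, z_j) * coeffs[j].

    prev_state     -- previous estimated state (e.g. position, velocity, ...)
    spike_rates    -- current instantaneous spike-rate vector
    A              -- linear state-transition matrix (list of rows)
    support_inputs -- stored spike-rate vectors z_j
    coeffs         -- one coefficient vector per z_j, same length as the state
    """
    linear = [sum(a * s for a, s in zip(row, prev_state)) for row in A]
    k = [gaussian_kernel(spike_rates, z, gamma) for z in support_inputs]
    kernel_part = [sum(kj * c[d] for kj, c in zip(k, coeffs))
                   for d in range(len(prev_state))]
    return [l + m for l, m in zip(linear, kernel_part)]

# Toy usage with a 2-D state and a single stored input.
A = [[1.0, 0.0], [0.0, 1.0]]
new_state = step([0.5, 0.5], [1.0, 0.0], A, [[1.0, 0.0]], [[2.0, -1.0]])
```

Because the previous estimate feeds into the next one, the tracker is dynamical, in contrast to a static regression that maps each spike-rate vector to a state independently.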
A Feature Selection Algorithm Based on the Global Minimization of a Generalization Error Bound
A novel linear feature selection algorithm is presented based on the global minimization of a data-dependent generalization error bound. Feature selection and scaling algorithms often lead to non-convex optimization problems, which in many previous approaches were addressed through gradient descent procedures that can only guarantee convergence to a local minimum. We propose an alternative approach, whereby the global solution of the non-convex optimization problem is derived via an equivalent convex optimization problem. Moreover, the convex optimization task is reduced to a conic quadratic programming problem for which efficient solvers are available. Highly competitive numerical results on both artificial and real-world data sets are reported.
A Topographic Support Vector Machine: Classification Using Local Label Configurations
Mohr, Johannes, Obermayer, Klaus
The standard approach to the classification of objects is to consider the examples as independent and identically distributed (iid). In many real-world settings, however, this assumption is not valid, because a topographical relationship exists between the objects. In this contribution we consider the special case of image segmentation, where the objects are pixels and where the underlying topography is a 2D regular rectangular grid. We introduce a classification method which not only uses measured vectorial feature information but also the label configuration within a topographic neighborhood. Due to the resulting dependence between the labels of neighboring pixels, a collective classification of a set of pixels becomes necessary. We propose a new method called 'Topographic Support Vector Machine' (TSVM), which is based on a topographic kernel and a self-consistent solution to the label assignment shown to be equivalent to a recurrent neural network. The performance of the algorithm is compared to a conventional SVM on a cell image segmentation task.