AITopics

We describe an algorithm for support vector machines (SVM) that can be parallelized efficiently and scales to very large problems with hundreds of thousands of training vectors. Instead of analyzing the whole training set in one optimization step, the data are split into subsets and optimized separately with multiple SVMs. The partial results are combined and filtered again in a'Cascade' of SVMs, until the global optimum is reached. The Cascade SVM can be spread over multiple processors with minimal communication overhead and requires far less memory, since the kernel matrices are much smaller than for a regular SVM. Convergence to the global optimum is guaranteed with multiple passes th rough the Cascade, but already a single pass provides good generalization. A single pass is 5x - 10x faster than a regular SVM for problems of 100,000 vectors when implemented on a single processor. Parallel implementations on a cluster of 16 processors were tested with over 1 million vectors (2-class problems), converging in a day or two, while a regular SVM never converged in over a week.

cascade, support vector, vector, (16 more...)

Country:

North America > United States > New York (0.04)
North America > United States > New Jersey > Mercer County > Princeton (0.04)
Europe > Norway > Central Norway > Trøndelag > Trondheim (0.04)
Europe > Germany > Saxony > Leipzig (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)

Zhu, Jerry, Kandola, Jaz, Ghahramani, Zoubin, Lafferty, John D.

Nonparametric Transforms of Graph Kernels for Semi-Supervised Learning

We present an algorithm based on convex optimization for constructing kernels for semi-supervised learning. The kernel matrices are derived from the spectral decomposition of graph Laplacians, and combine labeled and unlabeled data in a systematic fashion. Unlike previous work using diffusion kernels and Gaussian random field kernels, a nonparametric kernel approach is presented that incorporates order constraints during optimization. This results in flexible kernels and avoids the need to choose among different parametric forms. Our approach relies on a quadratically constrained quadratic program (QCQP), and is computationally feasible for large datasets. We evaluate the kernels on real datasets using support vector machines, with encouraging results.

constraint, kernel, spectral transformation, (14 more...)

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Asia > Middle East > Jordan (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.55)

Class-size Independent Generalization Analsysis of Some Discriminative Multi-Category Classification

Zhang, Tong

We consider the problem of deriving class-size independent generalization bounds for some regularized discriminative multi-category classification methods. In particular, we obtain an expected generalization bound for a standard formulation of multi-category support vector machines. Based on the theoretical result, we argue that the formulation over-penalizes misclassification error, which in theory may lead to poor generalization performance. A remedy, based on a generalization of multi-category logistic regression (conditional maximum entropy), is then proposed, and its theoretical properties are examined.

classification error, gen, generalization, (11 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > New Jersey > Mercer County > Princeton (0.04)

Genre: Research Report > New Finding (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.71)

Steinwart, Ingo, Scovel, Clint

Fast Rates to Bayes for Kernel Machines

We establish learning rates to the Bayes risk for support vector machines (SVMs) with hinge loss. In particular, for SVMs with Gaussian RBF kernels we propose a geometric condition for distributions which can be used to determine approximation properties of these kernels. Finally, we compare our methods with a recent paper of G. Blanchard et al..

exponent, kernel, svm, (15 more...)

Country:

North America > United States > New York (0.04)
North America > United States > New Mexico > Los Alamos County > Los Alamos (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > China > Hong Kong (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.58)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.31)

Steinwart, Ingo, Hush, Don, Scovel, Clint

Density Level Detection is Classification

We show that anomaly detection can be interpreted as a binary classification problem. Using this interpretation we propose a support vector machine (SVM) for anomaly detection. We then present some theoretical results which include consistency and learning rates. Finally, we experimentally compare our SVM with the standard one-class SVM.

algorithm, classification problem, support vector machine, (14 more...)

Country:

North America > United States > New Mexico > Los Alamos County > Los Alamos (0.05)
North America > United States > New York (0.04)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.57)

A Temporal Kernel-Based Model for Tracking Hand Movements from Neural Activities

Shpigelman, Lavi, Crammer, Koby, Paz, Rony, Vaadia, Eilon, Singer, Yoram

We devise and experiment with a dynamical kernel-based system for tracking hand movements from neural activity. The state of the system corresponds to the hand location, velocity, and acceleration, while the system's input are the instantaneous spike rates. The system's state dynamics is defined as a combination of a linear mapping from the previous estimated state and a kernel-based mapping tailored for modeling neural activities. In contrast to generative models, the activity-to-state mapping is learned using discriminative methods by minimizing a noise-robust loss function. We use this approach to predict hand trajectories on the basis of neural activity in motor cortex of behaving monkeys and find that the proposed approach is more accurate than both a static approach based on support vector regression and the Kalman filter.

algorithm, spike count, trajectory, (16 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > Scotland > City of Edinburgh > Edinburgh (0.04)
Asia > Middle East > Israel > Jerusalem District > Jerusalem (0.04)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.49)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.49)

Giesen, Joachim, Spalinger, Simon, Schölkopf, Bernhard

Kernel Methods for Implicit Surface Modeling

We describe methods for computing an implicit model of a hypersurface that is given only by a finite sampling. The methods work by mapping the sample points into a reproducing kernel Hilbert space and then determining regions in terms of hyperplanes.

kernel expansion, objective function, sample point, (15 more...)

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)
North America > United States > New York (0.05)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.31)

A Feature Selection Algorithm Based on the Global Minimization of a Generalization Error Bound

Peleg, Dori, Meir, Ron

A novel linear feature selection algorithm is presented based on the global minimization of a data-dependent generalization error bound. Feature selection and scaling algorithms often lead to non-convex optimization problems, which in many previous approaches were addressed through gradient descent procedures that can only guarantee convergence to a local minimum. We propose an alternative approach, whereby the global solution of the non-convex optimization problem is derived via an equivalent optimization problem. Moreover, the convex optimization task is reduced to a conic quadratic programming problem for which efficient solvers are available. Highly competitive numerical results on both artificial and real-world data sets are reported.

algorithm, dataset, optimization problem, (14 more...)

Country:

Asia > Middle East > Israel > Haifa District > Haifa (0.04)
North America > United States > Wisconsin (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Industry: Health & Medicine (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.47)

Mohr, Johannes, Obermayer, Klaus

A Topographic Support Vector Machine: Classification Using Local Label Configurations

The standard approach to the classification of objects is to consider the examples as independent and identically distributed (iid). In many real world settings, however, this assumption is not valid, because a topographical relationship exists between the objects. In this contribution we consider the special case of image segmentation, where the objects are pixels and where the underlying topography is a 2D regular rectangular grid. We introduce a classification method which not only uses measured vectorial feature information but also the label configuration within a topographic neighborhood. Due to the resulting dependence between the labels of neighboring pixels, a collective classification of a set of pixels becomes necessary. We propose a new method called'Topographic Support Vector Machine' (TSVM), which is based on a topographic kernel and a self-consistent solution to the label assignment shown to be equivalent to a recurrent neural network. The performance of the algorithm is compared to a conventional SVM on a cell image segmentation task.

classification, label configuration, vector, (11 more...)

Country:

Oceania > Australia > Queensland (0.05)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)
Europe > Germany > Berlin (0.05)
North America > United States > New York (0.04)

Industry: Health & Medicine > Therapeutic Area (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)

Koltchinskii, Vladimir, Martínez-ramón, Manel, Posse, Stefan

Optimal Aggregation of Classifiers and Boosting Maps in Functional Magnetic Resonance Imaging

We study a method of optimal data-driven aggregation of classifiers in a convex combination and establish tight upper bounds on its excess risk with respect to a convex loss function under the assumption that the solution of optimal aggregation problem is sparse. We use a boosting type algorithm of optimal aggregation to develop aggregate classifiers of activation patterns in fMRI based on locally trained SVM classifiers. The aggregation coefficients are then used to design a "boosting map" of the brain needed to identify the regions with most significant impact on classification.

aggregate classifier, aggregation, classifier, (15 more...)