Support Vector Machines
Bayesian Nonlinear Support Vector Machines and Discriminative Factor Modeling
Henao, Ricardo, Yuan, Xin, Carin, Lawrence
A new Bayesian formulation is developed for nonlinear support vector machines (SVMs), based on a Gaussian process and with the SVM hinge loss expressed as a scaled mixture of normals. We then integrate the Bayesian SVM into a factor model, in which feature learning and nonlinear classifier design are performed jointly; almost all previous work on such discriminative feature learning has assumed a linear classifier. Inference is performed with expectation conditional maximization (ECM) and Markov Chain Monte Carlo (MCMC). An extensive set of experiments demonstrates the utility of using a nonlinear Bayesian SVM within discriminative feature learning and factor modeling, from the standpoints of accuracy and interpretability.
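The abstract does not spell out the mixture representation. A minimal sketch, assuming it is the standard identity usually attributed to Polson and Scott, exp(-2 max(1 - m, 0)) = integral over lambda > 0 of N(1 - m | -lambda, lambda), where m = y f(x) is the margin. The function names, the integration limit, and the step count below are illustrative choices, not taken from the paper.

```python
import math


def hinge_pseudo_likelihood(margin):
    """exp(-2 * max(1 - margin, 0)), with margin = y * f(x)."""
    return math.exp(-2.0 * max(1.0 - margin, 0.0))


def mixture_integral(margin, upper=12.0, steps=20000):
    """Midpoint-rule estimate of the scale-mixture integral.

    Integrates N(1 - margin | mean=-lam, var=lam) over lam in (0, inf).
    The substitution lam = t**2 removes the 1/sqrt(lam) factor, leaving
    (2 / sqrt(2*pi)) * exp(-0.5 * (t + u / t)**2) with u = 1 - margin.
    The finite upper limit is adequate only for moderate margins.
    """
    u = 1.0 - margin
    h = upper / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h  # midpoints never hit t = 0
        total += math.exp(-0.5 * (t + u / t) ** 2)
    return total * h * 2.0 / math.sqrt(2.0 * math.pi)


if __name__ == "__main__":
    for m in (-1.0, 0.0, 1.0, 2.5):
        print(m, hinge_pseudo_likelihood(m), mixture_integral(m))
```

Conditional on lambda the pseudo-likelihood is Gaussian in f, which is what makes a Gaussian process prior convenient for ECM or MCMC updates.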
Object Localization based on Structural SVM using Privileged Information
Feyereisl, Jan, Kwak, Suha, Son, Jeany, Han, Bohyung
We propose a structured prediction algorithm for object localization based on Support Vector Machines (SVMs) using privileged information. Privileged information provides useful high-level knowledge for image understanding and facilitates learning a reliable model even with a small number of training examples. In our setting, we assume that such information is available only at training time since it may be difficult to obtain from visual data accurately without human supervision. Our goal is to improve performance by incorporating privileged information into an ordinary learning framework and adjusting model parameters for better generalization. We tackle the object localization problem with a novel structural SVM using privileged information, where an alternating loss-augmented inference procedure is employed to handle the term in the objective function corresponding to privileged information. We apply the proposed algorithm to the Caltech-UCSD Birds 200-2011 dataset and obtain encouraging results, suggesting further investigation into the benefit of privileged information in structured prediction.
Machine Learning for Neuroimaging with Scikit-Learn
Abraham, Alexandre, Pedregosa, Fabian, Eickenberg, Michael, Gervais, Philippe, Muller, Andreas, Kossaifi, Jean, Gramfort, Alexandre, Thirion, Bertrand, Varoquaux, Gaël
Statistical machine learning methods are increasingly used for neuroimaging data analysis. Their main virtue is their ability to model high-dimensional datasets, e.g. multivariate analysis of activation images or resting-state time series. Supervised learning is typically used in decoding or encoding settings to relate brain images to behavioral or clinical observations, while unsupervised learning can uncover hidden structures in sets of images (e.g. resting state functional MRI) or find sub-populations in large cohorts. By considering different functional neuroimaging applications, we illustrate how scikit-learn, a Python machine learning library, can be used to perform some key analysis steps. Scikit-learn contains a very large set of statistical learning algorithms, both supervised and unsupervised, and its application to neuroimaging data provides a versatile tool to study the brain.
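As an illustration of the decoding setting described above, the following is a minimal sketch using scikit-learn on synthetic data. The array sizes, the injected signal, and the variable names are assumptions made for the example; they do not come from the paper, and real neuroimaging data would first need to be masked and flattened into a samples-by-voxels array.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for masked brain images: 80 samples x 500 "voxels".
rng = np.random.default_rng(0)
n_samples, n_voxels = 80, 500
y = np.repeat([0, 1], n_samples // 2)       # two experimental conditions
X = rng.standard_normal((n_samples, n_voxels))
X[y == 1, :20] += 1.0                       # weak signal in 20 voxels

# Standardize each voxel, then fit a linear SVM as the decoder.
decoder = make_pipeline(StandardScaler(), SVC(kernel="linear", C=1.0))

# Stratified 5-fold cross-validation gives one accuracy per fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(decoder, X, y, cv=cv)
print("mean decoding accuracy:", scores.mean())
```

Wrapping the scaler and the classifier in one pipeline keeps the standardization inside each training fold, so no information from the held-out samples leaks into the fit.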
On the Impossibility of Convex Inference in Human Computation
Shah, Nihar B., Zhou, Dengyong
Human computation or crowdsourcing involves joint inference of the ground-truth-answers and the worker-abilities by optimizing an objective function, for instance, by maximizing the data likelihood based on an assumed underlying model. A variety of methods have been proposed in the literature to address this inference problem. As far as we know, none of the objective functions in existing methods is convex. In machine learning and applied statistics, a convex function such as the objective function of support vector machines (SVMs) is generally preferred, since it can leverage the high-performance algorithms and rigorous guarantees established in the extensive literature on convex optimization. One may thus wonder if there exists a meaningful convex objective function for the inference problem in human computation. In this paper, we investigate this convexity issue for human computation. We take an axiomatic approach by formulating a set of axioms that impose two mild and natural assumptions on the objective function for the inference. Under these axioms, we show that it is unfortunately impossible to ensure convexity of the inference problem. On the other hand, we show that interestingly, in the absence of a requirement to model "spammers", one can construct reasonable objective functions for crowdsourcing that guarantee convex inference.
A Joint Probabilistic Classification Model of Relevant and Irrelevant Sentences in Mathematical Word Problems
Cetintas, Suleyman, Si, Luo, Xin, Yan Ping, Zhang, Dake, Park, Joo Young, Tzur, Ron
Estimating the difficulty level of math word problems is an important task for many educational applications. Identification of relevant and irrelevant sentences in math word problems is an important step for calculating the difficulty levels of such problems. This paper addresses a novel application of text categorization to identify two types of sentences in mathematical word problems, namely relevant and irrelevant sentences. A novel joint probabilistic classification model is proposed to estimate the joint probability of classification decisions for all sentences of a math word problem by utilizing the correlation among all sentences, the correlation between the question sentence and other sentences, and the sentence text. The proposed model is compared with i) an SVM classifier that makes independent classification decisions for individual sentences using only the sentence text and ii) a novel SVM classifier that considers the correlation between the question sentence and other sentences along with the sentence text. An extensive set of experiments demonstrates the effectiveness of the joint probabilistic classification model for identifying relevant and irrelevant sentences, as well as of the novel SVM classifier that utilizes the correlation between the question sentence and other sentences. Furthermore, empirical results and analysis show that i) it is highly beneficial not to remove stopwords and ii) part-of-speech tagging does not yield a significant improvement, although it has been shown to be effective for the related task of math word problem type classification.
HIPAD - A Hybrid Interior-Point Alternating Direction algorithm for knowledge-based SVM and feature selection
Qin, Zhiwei, Tang, Xiaocheng, Akrotirianakis, Ioannis, Chakraborty, Amit
We consider classification tasks in the regime of scarce labeled training data in high dimensional feature space, where specific expert knowledge is also available. We propose a new hybrid optimization algorithm that solves the elastic-net support vector machine (SVM) through an alternating direction method of multipliers in the first phase, followed by an interior-point method for the classical SVM in the second phase. Both SVM formulations are adapted to knowledge incorporation. Our proposed algorithm addresses the challenges of automatic feature selection, high optimization accuracy, and algorithmic flexibility for taking advantage of prior knowledge. We demonstrate the effectiveness and efficiency of our algorithm and compare it with existing methods on a collection of synthetic and real-world data.
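The abstract does not state the optimization problems. One common way to write an elastic-net SVM, given here only as a sketch under assumed notation (weights $w$, bias $b$, training pairs $(x_i, y_i)$ with $y_i \in \{-1, +1\}$, and regularization weights $\lambda_1, \lambda_2 \ge 0$), is:

```latex
\min_{w,\, b} \;
  \lambda_1 \lVert w \rVert_1
  + \frac{\lambda_2}{2} \lVert w \rVert_2^2
  + \frac{1}{n} \sum_{i=1}^{n} \max\bigl(0,\; 1 - y_i (w^\top x_i + b)\bigr)
```

Setting $\lambda_1 = 0$ recovers the classical soft-margin SVM objective. The $\ell_1$ term is what drives coefficients to exactly zero, which is consistent with the abstract's description of a first phase that performs feature selection followed by a classical SVM phase. The paper's own formulation, including its knowledge constraints, may differ.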
Convex Optimization for Big Data
Cevher, Volkan, Becker, Stephen, Schmidt, Mark
This article reviews recent advances in convex optimization algorithms for Big Data, which aim to reduce the computational, storage, and communications bottlenecks. We provide an overview of this emerging field, describe contemporary approximation techniques like first-order methods and randomization for scalability, and survey the important role of parallel and distributed computation. The new Big Data algorithms are based on surprisingly simple principles and attain staggering accelerations even on classical problems. However, the importance of convex formulations and optimization has increased even more dramatically in the last decade due to the rise of new theory for structured sparsity and rank minimization, and successful statistical learning models like support vector machines. These formulations are now employed in a wide variety of signal processing applications including compressive sensing, medical imaging, geophysics, and bioinformatics [1-4]. There are several important reasons for this explosion of interest, with two of the most obvious ones being the existence of efficient algorithms for computing globally optimal solutions and the ability to use convex geometry to prove useful properties about the solution [1, 2]. A unified convex formulation also transfers useful knowledge across different disciplines, such as sampling and computation, that focus on different aspects of the same underlying mathematical problem [5]. However, the renewed popularity of convex optimization places convex algorithms under tremendous pressure to accommodate increasingly large data sets and to solve problems in unprecedented dimensions. In response, convex optimization is reinventing itself for Big Data where the data and parameter sizes of optimization problems are too large to process locally, and where even basic linear algebra routines like Cholesky decompositions and matrix-matrix or matrix-vector multiplications that algorithms take for granted are prohibitive.
Understanding Touch Gestures on a Humanoid Robot
Lawson, Wallace E. (Naval Research Lab) | Sullivan, Keith (Excelis) | Trafton, Greg (Naval Research Lab)
Touch can be a powerful means of communication especially when it is combined with other sensing modalities, such as speech. The challenge on a humanoid robot is to sense touch in a way that can be sensitive to subtle cues, such as the hand used and amount of force applied. We propose a novel combination of sensing modalities to extract touch information. We extract hand information using the Leap Motion active sensor, then determine force information from force sensitive resistors. We combine these sensing modalities at the feature level, then train a support vector machine to recognize specific touch gestures. We demonstrate a high level of accuracy recognizing four different touch gestures from the firefighting domain.
Fully Automated Myocardial Infarction Classification using Ordinary Differential Equations
Portable, wearable, and wireless electrocardiogram (ECG) systems have the potential to be used as point-of-care diagnostic systems for cardiovascular disease. Such wearable and wireless ECG systems require automatic detection of cardiovascular disease. Even in primary care, automating ECG diagnostic systems will improve the efficiency of ECG diagnosis and reduce the minimum training required of local healthcare workers. However, few fully automatic myocardial infarction (MI) detection algorithms have been well developed. This paper presents a novel automatic MI classification algorithm using a second-order ordinary differential equation (ODE) with time-varying coefficients, which simultaneously captures morphological and dynamic features of highly correlated ECG signals. By effectively estimating the unobserved state variables and the parameters of the second-order ODE, the accuracy of the classification was significantly improved. The estimated time-varying coefficients of the second-order ODE were used as input to a support vector machine (SVM) for MI classification. The proposed method was applied to the PTB diagnostic ECG database within PhysioNet. The overall sensitivity, specificity, and classification accuracy of 12-lead ECGs for binary MI classification were 98.7%, 96.4% and 98.3%, respectively. We also found that even with single-lead ECG signals, accuracy can reach as high as 97%. Multiclass MI classification is a challenging task, but the developed ODE approach for 12-lead ECGs coupled with a multiclass SVM reached 96.4% accuracy for classifying 5 subgroups of MI and healthy controls.
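For readers less familiar with the three reported figures, the following sketch shows how sensitivity, specificity, and accuracy are computed from binary predictions. The helper name `binary_metrics` and the toy labels are made up for illustration; they are not the study's data.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity, accuracy) for binary labels."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1      # diseased, correctly flagged
            else:
                fn += 1      # diseased, missed
        else:
            if pred == positive:
                fp += 1      # healthy, wrongly flagged
            else:
                tn += 1      # healthy, correctly cleared
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Toy example: 1 = MI, 0 = healthy control (invented labels).
toy_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
toy_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
sens, spec, acc = binary_metrics(toy_true, toy_pred)
print(sens, spec, acc)
```

Because accuracy mixes both classes, it depends on the class balance of the test set, which is why sensitivity and specificity are reported alongside it.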
Classification of Autism Spectrum Disorder Using Supervised Learning of Brain Connectivity Measures Extracted from Synchrostates
Jamal, Wasifa, Das, Saptarshi, Oprescu, Ioana-Anastasia, Maharatna, Koushik, Apicella, Fabio, Sicca, Federico
Objective. The paper investigates the presence of autism using functional brain connectivity measures derived from the electroencephalogram (EEG) of children during face perception tasks. Approach. Phase-synchronized patterns from 128-channel EEG signals are obtained for typical children and children with autism spectrum disorder (ASD). The phase-synchronized states, or synchrostates, temporally switch amongst themselves as an underlying process for the completion of a particular cognitive task. We used 12 subjects in each group (ASD and typical) and analyzed their EEG while they processed fearful, happy and neutral faces. The minimally and maximally occurring synchrostates for each subject are chosen for extraction of brain connectivity features, which are used for classification between these two groups of subjects. Among different supervised learning techniques, we explore discriminant analysis and support vector machines, both with polynomial kernels, for the classification task. Main results. Leave-one-out cross-validation of the classification algorithm gives 94.7% accuracy as the best performance, with corresponding sensitivity and specificity values of 85.7% and 100% respectively. Significance. The proposed method gives high classification accuracies and outperforms other contemporary research results. The effectiveness of the proposed method for classification of autistic and typical children suggests the possibility of using it on a larger population to validate it for clinical practice.
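The leave-one-out procedure mentioned in the results can be sketched as follows. To keep the example dependency-free, a nearest-centroid rule is swapped in for the polynomial-kernel classifiers used in the paper, and the two-dimensional feature vectors are invented toy values, not connectivity features from the study.

```python
import math


def nearest_centroid_predict(train_X, train_y, x):
    """Stand-in classifier: return the label of the closest class mean."""
    sums, counts = {}, {}
    for row, label in zip(train_X, train_y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    best_label, best_dist = None, math.inf
    for label, acc in sums.items():
        centroid = [s / counts[label] for s in acc]
        dist = math.dist(centroid, x)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def leave_one_out_accuracy(X, y, predict=nearest_centroid_predict):
    """Hold out each subject once, train on the rest, and score the hits."""
    hits = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if predict(train_X, train_y, X[i]) == y[i]:
            hits += 1
    return hits / len(X)


# Invented toy features: one row per subject.
toy_X = [[0.1, 0.2], [0.0, 0.4], [0.3, 0.1],
         [2.0, 2.1], [1.8, 2.4], [2.2, 1.9]]
toy_y = ["typical"] * 3 + ["ASD"] * 3
print(leave_one_out_accuracy(toy_X, toy_y))
```

With small cohorts each held-out subject shifts the accuracy by a large step, so leave-one-out estimates of this kind carry wide uncertainty.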