Support Vector Machines
An Approach to One-Bit Compressed Sensing Based on Probably Approximately Correct Learning Theory
Ahsen, Mehmet Eren, Vidyasagar, Mathukumalli
In this paper, the problem of one-bit compressed sensing (OBCS) is formulated as a problem in probably approximately correct (PAC) learning. It is shown that the Vapnik-Chervonenkis (VC-) dimension of the set of half-spaces in $\mathbb{R}^n$ generated by $k$-sparse vectors is bounded below by $k \lg (n/k)$ and above by $2k \lg (n/k)$, plus some round-off terms. By coupling this estimate with well-established results in PAC learning theory, we show that a consistent algorithm can recover a $k$-sparse vector with $O(k \lg (n/k))$ measurements, given only the signs of the measurement vector. This result holds for \textit{all} probability measures on $\mathbb{R}^n$. It is further shown that random sign-flipping errors result only in an increase in the constant in the $O(k \lg (n/k))$ estimate. Because constructing a consistent algorithm is not straightforward, we present a heuristic based on the $\ell_1$-norm support vector machine, and illustrate that its computational performance is superior to that of a currently popular method.
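To get a feel for the scale of the bounds quoted above, the following minimal sketch evaluates the leading terms $k \lg (n/k)$ and $2k \lg (n/k)$ for a few problem sizes. The function name `vc_dimension_bounds` and the example sizes are illustrative assumptions, and the round-off terms mentioned in the abstract are deliberately left out, so the numbers are only indicative.

```python
import math
from typing import Tuple


def vc_dimension_bounds(n: int, k: int) -> Tuple[float, float]:
    """Leading-order lower and upper bounds, k*lg(n/k) and 2k*lg(n/k).

    Assumption: the round-off terms mentioned in the abstract are omitted.
    """
    if not 0 < k <= n:
        raise ValueError("need 0 < k <= n")
    lower = k * math.log2(n / k)
    return lower, 2 * lower


if __name__ == "__main__":
    # Hypothetical problem sizes, chosen only for illustration.
    for n, k in [(1024, 8), (1024, 32), (4096, 32)]:
        lower, upper = vc_dimension_bounds(n, k)
        print(f"n={n:5d} k={k:3d}  lower~{lower:7.1f}  upper~{upper:7.1f}")
```

Both terms grow linearly in $k$ but only logarithmically in $n$, which is why the measurement count $O(k \lg (n/k))$ can be far smaller than the ambient dimension.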
Principal Boundary on Riemannian Manifolds
We revisit the classification problem and focus on nonlinear methods for classification on manifolds. For multivariate datasets lying on an embedded nonlinear Riemannian manifold within the higher-dimensional space, our aim is to acquire a classification boundary between the classes with labels. Motivated by the principal flow [Panaretos, Pham and Yao, 2014], a curve that moves along a path of the maximum variation of the data, we introduce the principal boundary. From the classification perspective, the principal boundary is defined as an optimal curve that moves in between the principal flows traced out from two classes of the data, and at any point on the boundary, it maximizes the margin between the two classes. We estimate the boundary in quality with its direction supervised by the two principal flows. We show that the principal boundary yields the usual decision boundary found by the support vector machine, in the sense that locally, the two boundaries coincide. By means of examples, we illustrate how to find, use and interpret the principal boundary.
Mastering Machine Learning with scikit-learn (PACKT Books)
This book examines machine learning models including logistic regression, decision trees, and support vector machines, and applies them to common problems such as categorizing documents and classifying images. It begins with the fundamentals of machine learning, introducing you to the supervised-unsupervised spectrum, the uses of training and test data, and evaluating models. You will learn how to use generalized linear models in regression problems, as well as solve problems with text and categorical features. You will be acquainted with the use of logistic regression, regularization, and the various loss functions that are used by generalized linear models. The book will also walk you through an example project that prompts you to label the most uncertain training examples.
Spectral Algorithms for Computing Fair Support Vector Machines
Classifiers and rating scores are prone to implicitly codifying biases, which may be present in the training data, against protected classes (e.g., age, gender, or race). It is therefore important to understand how to design classifiers and scores that prevent discrimination in predictions. This paper develops computationally tractable algorithms for designing accurate but fair support vector machines (SVMs). Our approach imposes a constraint on the covariance matrices conditioned on each protected class, which leads to a nonconvex quadratic constraint in the SVM formulation. We develop iterative algorithms to compute fair linear and kernel SVMs, which solve a sequence of relaxations constructed using a spectral decomposition of the nonconvex constraint. The effectiveness of these algorithms in achieving high prediction accuracy while ensuring fairness is shown through numerical experiments on several data sets.
MLBench: How Good Are Machine Learning Clouds for Binary Classification Tasks on Structured Data?
Liu, Yu, Zhang, Hantian, Zeng, Luyuan, Wu, Wentao, Zhang, Ce
We conduct an empirical study of machine learning functionalities provided by major cloud service providers, which we call machine learning clouds. Machine learning clouds hold the promise of hiding all the sophistication of running large-scale machine learning: instead of specifying how to run a machine learning task, users only specify what machine learning task to run and the cloud figures out the rest. Raising the level of abstraction, however, rarely comes free -- a performance penalty is possible. How good, then, are current machine learning clouds on real-world machine learning workloads? We study this question with a focus on binary classification problems. We present mlbench, a novel benchmark constructed by harvesting datasets from Kaggle competitions. We then compare the performance of the top winning code available from Kaggle with that of machine learning clouds from both Azure and Amazon on mlbench. Our comparative study reveals the strengths and weaknesses of existing machine learning clouds and points out potential future directions for improvement.
Self-Taught Support Vector Machine
In this paper, we propose a new approach to classification for a target task that uses limited labeled target data together with a large amount of unlabeled source data, a setting called self-taught learning. The target and source data can be drawn from different distributions. Previous approaches assume covariate shift, where the marginal distributions p(x) change across domains while the conditional distributions p(y|x) remain the same. In our approach, we propose a new objective function that simultaneously learns a common space T(.) in which the conditional distributions p(T(x)|y) remain the same across domains, and learns robust SVM classifiers for the target task using both source and target data in the new representation. Hence, the proposed objective function also incorporates the hidden labels of the source data. We applied the proposed approach to the Caltech-256 and MSRC+LMO datasets and compared the performance of our algorithm with that of the available competing methods. Our method outperforms the successful existing algorithms.
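The two distributional assumptions contrasted in this abstract can be written side by side. The subscripts $S$ (source) and $T$ (target) on the densities are notation assumed here for illustration; the abstract itself only names $p(x)$, $p(y|x)$ and $p(T(x)|y)$, and the map $T(\cdot)$ is the learned common space, not the target domain.

```latex
\begin{align*}
\text{covariate shift (previous approaches):} \quad
  & p_S(x) \neq p_T(x), \qquad p_S(y \mid x) = p_T(y \mid x) \\
\text{proposed assumption:} \quad
  & p_S\bigl(T(x) \mid y\bigr) = p_T\bigl(T(x) \mid y\bigr)
\end{align*}
```

Read this way, the proposed condition is placed on the class-conditional distribution of the transformed features rather than on the label posterior, which is why the hidden labels of the source data enter the objective.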
Hyperparameter Importance Across Datasets
With the advent of automated machine learning, automated hyperparameter optimization methods are by now routinely used. However, this progress is not yet matched by equal progress on automatic analyses that yield information beyond performance-optimizing hyperparameter settings. In this work, we aim to answer the following two questions: Given an algorithm, what are generally its most important hyperparameters, and what are good priors over its hyperparameters' ranges to draw values from? We present methodology and a framework to answer these questions based on meta-learning across many datasets. We apply this methodology using the experimental meta-data available on OpenML to determine the most important hyperparameters of support vector machines, random forests and AdaBoost, and to infer priors for all their hyperparameters. Our results, obtained fully automatically, provide a quantitative basis to focus efforts in both manual algorithm design and in automated hyperparameter optimization. Our experiments confirm that the selected hyperparameters are indeed the most important ones and that our obtained priors also lead to improvements in hyperparameter optimization.
Game-Theoretic Design of Secure and Resilient Distributed Support Vector Machines with Adversaries
With a large number of sensors and control units in networked systems, distributed support vector machines (DSVMs) play a fundamental role in scalable and efficient multi-sensor classification and prediction tasks. However, DSVMs are vulnerable to adversaries who can modify and generate data to deceive the system into misclassification and misprediction. This work aims to design defense strategies for a DSVM learner against a potential adversary. We establish a game-theoretic framework to capture the conflicting interests between the DSVM learner and the attacker. The Nash equilibrium of the game allows predicting the outcome of learning algorithms in adversarial environments, and enhancing the resilience of the machine learning through dynamic distributed learning algorithms. We show that the DSVM learner is less vulnerable when it uses a balanced network with fewer nodes and higher degree. We also show that adding more training samples is an efficient defense strategy against an attacker. We present secure and resilient DSVM algorithms with a verification method and a rejection method, and show their resiliency against an adversary with numerical experiments.
Effects of Images with Different Levels of Familiarity on EEG
Evaluating human brain potentials while subjects view different images can be used for memory evaluation, information retrieval, guilty-innocent identification and examining the brain response. In this study, the effects of viewing images with different levels of familiarity on subjects' electroencephalogram (EEG) were studied. Three groups of images, with familiarity levels of "unfamiliar", "familiar" and "very familiar", were considered. EEG signals of 21 subjects (14 men) were recorded. After signal acquisition, pre-processing, including noise and artifact removal, was performed on epochs of data. Features, including spatial-statistical, wavelet, frequency and harmonic parameters, as well as correlation between recording channels, were extracted from the data. Then, we evaluated the efficiency of the extracted features using p-values and an orthogonal feature selection method (a combination of the Gram-Schmidt method and the Fisher discriminant ratio) for feature dimensionality reduction. As the final step of feature selection, we used the 'add-r take-away l' method to choose the most discriminative features. For data classification, including all two-class and three-class cases, we applied a support vector machine (SVM) to the extracted features. The correct classification rates (CCR) for the "unfamiliar-familiar", "unfamiliar-very familiar" and "familiar-very familiar" cases were 85.6%, 92.6%, and 70.6%, respectively. The best classification results were obtained in the pre-frontal and frontal regions of the brain. Wavelet, frequency and harmonic features were among the most discriminative features. Finally, in the three-class case, the best CCR was 86.8%.
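The Fisher discriminant ratio mentioned in this abstract can be illustrated with a short sketch. It uses one common two-class, single-feature form of the ratio, (difference of class means)^2 / (sum of class variances), and assumes population variance; the study's exact formulation is not given in the abstract. The sketch omits the orthogonalization step and the 'add-r take-away l' search. The function names and the toy data are hypothetical.

```python
from statistics import mean, pvariance


def fisher_ratio(class_a, class_b):
    """Two-class Fisher discriminant ratio for one feature.

    Assumed form: (mean_a - mean_b)^2 / (var_a + var_b), population variance.
    """
    spread = pvariance(class_a) + pvariance(class_b)
    if spread == 0:
        raise ValueError("both classes have zero variance for this feature")
    return (mean(class_a) - mean(class_b)) ** 2 / spread


def rank_features(samples_a, samples_b):
    """Score every feature column and return (indices by descending score, scores)."""
    n_features = len(samples_a[0])
    scores = [
        fisher_ratio([row[j] for row in samples_a],
                     [row[j] for row in samples_b])
        for j in range(n_features)
    ]
    order = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return order, scores


if __name__ == "__main__":
    # Toy data (hypothetical): rows are trials, columns are features.
    class_a = [[1.0, 0.0], [3.0, 2.0]]
    class_b = [[5.0, 1.0], [7.0, 3.0]]
    order, scores = rank_features(class_a, class_b)
    print("feature ranking:", order)
    print("scores:", scores)
```

A feature scores highly when the class means are far apart relative to the within-class spread, so the top-ranked features are the natural candidates to pass on to the classifier.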
Chapter 2: SVM (Support Vector Machine) - Theory (Machine Learning 101, Medium)
Welcome to the second stepping stone of Supervised Machine Learning. Again, this chapter is divided into two parts. In Part 2, we take on a small coding exercise challenge. If you haven't read the Naive Bayes chapter, I would suggest reading it thoroughly first. Don't worry, we shall learn in layman's terms.