Support Vector Machines
Randomized Kernel Methods for Least-Squares Support Vector Machines
The least-squares support vector machine is a frequently used kernel method for non-linear regression and classification tasks. Here we discuss several approximation algorithms for the least-squares support vector machine classifier. The proposed methods are based on randomized block kernel matrices, and we show that they provide good accuracy and reliable scaling for multi-class classification problems with relatively large data sets. Also, we present several numerical experiments that illustrate the practical applicability of the proposed methods.
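For orientation, the exact (non-approximated) least-squares SVM classifier that such methods aim to approximate can be sketched as a single linear solve. This is a minimal illustration of the standard textbook formulation, not the randomized block method proposed in the paper. The RBF kernel, the parameter names `gamma` and `sigma`, and the helper `solve_linear_system` are assumptions made for this example.

```python
import math

def rbf_kernel(a, b, sigma=1.0):
    """Gaussian RBF kernel (an assumed choice; the abstract names no kernel)."""
    sq = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq / (2.0 * sigma ** 2))

def solve_linear_system(A, rhs):
    """Hypothetical helper: Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def lssvm_train(X, y, gamma=10.0, sigma=1.0):
    """Solve [[0, y^T], [y, Omega + I/gamma]] [b; alpha] = [0; 1],
    where Omega[i][j] = y_i * y_j * K(x_i, x_j) and labels are +1 / -1."""
    n = len(X)
    A = [[0.0] + [float(v) for v in y]]
    for i in range(n):
        row = [float(y[i])]
        for j in range(n):
            k = y[i] * y[j] * rbf_kernel(X[i], X[j], sigma)
            row.append(k + (1.0 / gamma if i == j else 0.0))
        A.append(row)
    sol = solve_linear_system(A, [0.0] + [1.0] * n)
    return sol[0], sol[1:]  # bias b, multipliers alpha

def lssvm_predict(x, X, y, b, alpha, sigma=1.0):
    """Sign of the kernel expansion plus bias."""
    score = b + sum(a * yi * rbf_kernel(x, xi, sigma)
                    for a, yi, xi in zip(alpha, y, X))
    return 1 if score >= 0 else -1
```

The dense solve costs on the order of n^3 operations for n training points, which is the kind of scaling bottleneck that approximation schemes for larger data sets try to avoid.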
Data Driven Exploratory Attacks on Black Box Classifiers in Adversarial Domains
Sethi, Tegjyot Singh, Kantardzic, Mehmed
While modern web applications aim to create impact at the civilization level, they have become vulnerable to adversarial activity, where the next cyber-attack can take any shape and can originate from anywhere. The increasing scale and sophistication of attacks has prompted the need for a data-driven solution, with machine learning forming the core of many cybersecurity systems. Machine learning was not designed with security in mind, and the essential assumption of stationarity, requiring that the training and testing data follow similar distributions, is violated in an adversarial domain. In this paper, an adversary's viewpoint of a classification-based system is presented. Based on a formal adversarial model, the Seed-Explore-Exploit framework is presented for simulating the generation of data-driven and reverse-engineering attacks on classifiers. Experimental evaluation on 10 real-world datasets and using the Google Cloud Prediction Platform demonstrates the innate vulnerability of classifiers and the ease with which evasion can be carried out, without any explicit information about the classifier type, the training data or the application domain. The proposed framework, algorithms and empirical evaluation serve as a white-hat analysis of the vulnerabilities and aim to foster the development of secure machine learning frameworks.
On the Use of Default Parameter Settings in the Empirical Evaluation of Classification Algorithms
Bagnall, Anthony, Cawley, Gavin C.
We demonstrate that, for a range of state-of-the-art machine learning algorithms, the differences in generalisation performance obtained using default parameter settings and using parameters tuned via cross-validation can be similar in magnitude to the differences in performance observed between state-of-the-art and uncompetitive learning systems. This means that fair and rigorous evaluation of new learning algorithms requires performance comparison against benchmark methods with best-practice model selection procedures, rather than using default parameter settings. We investigate the sensitivity of three key machine learning algorithms (support vector machine, random forest and rotation forest) to their default parameter settings, and provide guidance on determining sensible default parameter values for implementations of these algorithms. We also conduct an experimental comparison of these three algorithms on 121 classification problems and find that, perhaps surprisingly, rotation forest is significantly more accurate on average than both random forest and a support vector machine.
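The comparison protocol described above (default settings versus parameters tuned by cross-validation) can be sketched as follows. This is only an illustration under loud assumptions: a tiny k-nearest-neighbour classifier stands in for the SVM and forest models studied in the paper, the single hyperparameter `k` stands in for their parameters, and all function names are hypothetical. The tuned estimate uses nested cross-validation so that the test fold never influences parameter selection.

```python
def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k nearest training points (stand-in model)."""
    order = sorted(range(len(train_X)),
                   key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)))
    votes = [train_y[i] for i in order[:k]]
    return max(sorted(set(votes)), key=votes.count)

def fold_indices(n, n_folds):
    """Interleaved folds: fold f holds indices f, f + n_folds, ..."""
    return [list(range(f, n, n_folds)) for f in range(n_folds)]

def split(X, y, test_idx):
    """Return the training part of (X, y) with test_idx held out."""
    held_out = set(test_idx)
    train = [i for i in range(len(X)) if i not in held_out]
    return [X[i] for i in train], [y[i] for i in train]

def cv_accuracy(X, y, k, n_folds=5):
    """Cross-validated accuracy for one fixed (e.g. default) setting of k."""
    correct = 0
    for test_idx in fold_indices(len(X), n_folds):
        train_X, train_y = split(X, y, test_idx)
        correct += sum(knn_predict(train_X, train_y, X[i], k) == y[i]
                       for i in test_idx)
    return correct / len(X)

def nested_cv_accuracy(X, y, grid, n_folds=5):
    """Outer folds estimate accuracy; an inner CV on each outer training
    set picks k from the grid, so tuning never sees the outer test fold."""
    correct = 0
    for test_idx in fold_indices(len(X), n_folds):
        train_X, train_y = split(X, y, test_idx)
        best_k = max(grid, key=lambda k: cv_accuracy(train_X, train_y, k, n_folds))
        correct += sum(knn_predict(train_X, train_y, X[i], best_k) == y[i]
                       for i in test_idx)
    return correct / len(X)
```

Calling `cv_accuracy(X, y, k=5)` gives the "default" figure and `nested_cv_accuracy(X, y, grid=[1, 3, 5])` the "tuned" one; the gap between the two numbers is the quantity the abstract argues can rival the gap between competing algorithms.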
Universal Consistency and Robustness of Localized Support Vector Machines
This paper analyses properties of localized kernel-based, nonparametric statistical machine learning methods, in particular support vector machines (SVMs) and methods close to them. Owing to the enormous research activity, there is an abundance of general introductions to this field of computer science and statistics. Besides many publications in international journals, there are summarizing textbooks written from a mathematical or statistical point of view, for example Cristianini & Shawe-Taylor (2000), Schölkopf & Smola (2001), Steinwart & Christmann (2008) or Cucker & Zhou (2007). Nevertheless, we want to give a short overview of the analyzed topic. Support vector machines were initially introduced by Boser, Guyon & Vapnik (1992) and Cortes & Vapnik (1995), based on earlier work like the Russian original of Vapnik, Chervonenkis & Červonenkis (1979).
Locked-In ALS Patients Answer Yes or No Questions with Wearable fNIRS Device
Despite partial success, communication has remained impossible for persons suffering from complete motor paralysis but intact cognitive and emotional processing, a state called complete locked-in state (CLIS). Based on a motor learning theoretical context and on the failure of neuroelectric brain–computer interface (BCI) communication attempts in CLIS, we here report BCI communication using functional near-infrared spectroscopy (fNIRS) and an implicit attentional processing procedure. Four patients suffering from advanced amyotrophic lateral sclerosis (ALS)--two of them in permanent CLIS and two entering the CLIS without reliable means of communication--learned to answer personal questions with known answers and open questions all requiring a "yes" or "no" thought using frontocentral oxygenation changes measured with fNIRS. Three patients completed more than 46 sessions spread over several weeks, and one patient (patient W) completed 20 sessions. Online fNIRS classification of personal questions with known answers and open questions using linear support vector machine (SVM) resulted in an above-chance-level correct response rate over 70%.
Machine Learning App Development - Things You Must Have Missed - Algoworks
Project failures are very common in IT. This risk is higher if you are adopting a new technology that is unfamiliar to your organization. Machine learning is not at all new to the world, but development and awareness have now reached a point at which its benefits are becoming attractive for business. Machine learning has huge potential for reducing costs and finding new revenue when the technology is applied aptly, but there can be many pitfalls if it is not implemented properly. There is a lot for developers to do in machine learning, as it offers the promise of applying business-critical analytics to any application.
Regularising Non-linear Models Using Feature Side-information
Mollaysa, Amina, Strasser, Pablo, Kalousis, Alexandros
Very often features come with their own vectorial descriptions which provide detailed information about their properties. We refer to these vectorial descriptions as feature side-information. In the standard learning scenario, input is represented as a vector of features and the feature side-information is most often ignored or used only for feature selection prior to model fitting. We believe that feature side-information, which carries information about the features' intrinsic properties, will help improve model prediction if used properly during the learning process. In this paper, we propose a framework that allows for the incorporation of the feature side-information during the learning of very general model families to improve the prediction performance. We control the structures of the learned models so that they reflect feature similarities as these are defined on the basis of the side-information. We perform experiments on a number of benchmark datasets which show significant predictive performance gains over a number of baselines as a result of exploiting the side-information.
Faster Coordinate Descent via Adaptive Importance Sampling
Perekrestenko, Dmytro, Cevher, Volkan, Jaggi, Martin
Coordinate descent methods employ random partial updates of decision variables in order to solve huge-scale convex optimization problems. In this work, we introduce new adaptive rules for the random selection of their updates. By adaptive, we mean that our selection rules are based on the dual residual or the primal-dual gap estimates and can change at each iteration. We theoretically characterize the performance of our selection rules, demonstrate improvements over the state-of-the-art, and extend our theory and algorithms to general convex objectives. Numerical evidence with hinge-loss support vector machines and Lasso confirms that the practice follows the theory.
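The general idea of an adaptive sampling rule can be illustrated on a small least-squares problem. The sketch below is a simplified stand-in, not the paper's rule: it samples coordinate j with probability proportional to the current gradient magnitude |g_j| instead of the dual-residual or duality-gap estimates mentioned above, and it recomputes the full gradient at every step for clarity, which a practical implementation would avoid. All function names are hypothetical.

```python
import random

def objective(A, b, x):
    """f(x) = 0.5 * ||A x - b||^2 for A given as a list of rows."""
    return 0.5 * sum((sum(a * xi for a, xi in zip(row, x)) - bi) ** 2
                     for row, bi in zip(A, b))

def gradient(A, b, x):
    """Full gradient A^T (A x - b)."""
    residual = [sum(a * xi for a, xi in zip(row, x)) - bi
                for row, bi in zip(A, b)]
    return [sum(A[r][j] * residual[r] for r in range(len(A)))
            for j in range(len(x))]

def adaptive_coordinate_descent(A, b, n_iters=2000, seed=0):
    """Coordinate descent where the sampling distribution is rebuilt at
    every iteration from the current gradient (assumed rule: p_j ~ |g_j|)."""
    rng = random.Random(seed)
    n = len(A[0])
    x = [0.0] * n
    # Coordinate-wise curvature: squared norm of each column of A.
    col_sq = [sum(row[j] ** 2 for row in A) for j in range(n)]
    for _ in range(n_iters):
        g = gradient(A, b, x)
        weights = [abs(gj) for gj in g]
        if sum(weights) == 0.0:
            break  # stationary point reached
        j = rng.choices(range(n), weights=weights)[0]
        x[j] -= g[j] / col_sq[j]  # exact minimization along coordinate j
    return x
```

Because each update minimizes the objective exactly along the chosen coordinate, the objective never increases; the adaptive weights only change which coordinates are visited more often.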
Fast Bayesian Optimization of Machine Learning Hyperparameters on Large Datasets
Klein, Aaron, Falkner, Stefan, Bartels, Simon, Hennig, Philipp, Hutter, Frank
Bayesian optimization has become a successful tool for hyperparameter optimization of machine learning algorithms, such as support vector machines or deep neural networks. Despite its success, for large datasets, training and validating a single configuration often takes hours, days, or even weeks, which limits the achievable performance. To accelerate hyperparameter optimization, we propose a generative model for the validation error as a function of training set size, which is learned during the optimization process and allows exploration of preliminary configurations on small subsets, by extrapolating to the full dataset. We construct a Bayesian optimization procedure, dubbed Fabolas, which models loss and training time as a function of dataset size and automatically trades off high information gain about the global optimum against computational cost. Experiments optimizing support vector machines and deep neural networks show that Fabolas often finds high-quality solutions 10 to 100 times faster than other state-of-the-art Bayesian optimization methods or the recently proposed bandit strategy Hyperband.