Support Vector Machines
Machine Learning Techniques for Diagnostic Differentiation of Mild Cognitive Impairment and Dementia
Williams, Jennifer A. (Washington State University) | Weakley, Alyssa (Washington State University) | Cook, Diane J. (Washington State University) | Schmitter-Edgecombe, Maureen (Washington State University)
Detection of cognitive impairment, especially at the early stages, is critical. Such detection has traditionally been performed manually by one or more clinicians based on reports and test results. Machine learning algorithms offer an alternative method of detection that may provide an automated process and valuable insights into diagnosis and classification. In this paper, we explore the use of neuropsychological and demographic data to predict Clinical Dementia Rating (CDR) scores (no dementia, very mild dementia, dementia) and clinical diagnoses (cognitively healthy, mild cognitive impairment, dementia) through the implementation of four machine learning algorithms, naïve Bayes (NB), C4.5 decision tree (DT), back-propagation neural network (NN), and support vector machine (SVM). Additionally, a feature selection method for reducing the number of neuropsychological and demographic data needed to make an accurate diagnosis was investigated. The NB classifier provided the best accuracies, while the SVM classifier proved to offer some of the lowest accuracies. We also illustrate that with the use of feature selection, accuracies can be improved. The experiments reported in this paper indicate that artificial intelligence techniques can be used to automate aspects of clinical diagnosis of individuals with cognitive impairment.
Personality Traits Recognition on Social Network - Facebook
Alam, Firoj (University of Trento) | Stepanov, Evgeny A. (University of Trento) | Riccardi, Giuseppe (University of Trento)
For the natural and social interaction it is necessary to understand human behavior. Personality is one of the fundamental aspects, by which we can understand behavioral dispositions. It is evident that there is a strong correlation between users’ personality and the way they behave on online social network (e.g., Facebook). This paper presents automatic recognition of Big-5 personality traits on social network (Facebook) using users’ status text. For the automatic recognition we studied different classification methods such as SMO (Sequential Minimal Optimization for Support Vector Machine), Bayesian Logistic Regression (BLR) and Multinomial Naïve Bayes (MNB) sparse modeling. Performance of the systems had been measured using macro-averaged precision, recall and F1; weighted average accuracy (WA) and un-weighted average accuracy (UA). Our comparative study shows that MNB performs better than BLR and SMO for personality traits recognition on the social network data.
Hacking Smart Machines with Smarter Ones: How to Extract Meaningful Data from Machine Learning Classifiers
Ateniese, Giuseppe, Felici, Giovanni, Mancini, Luigi V., Spognardi, Angelo, Villani, Antonio, Vitali, Domenico
Machine Learning (ML) algorithms are used to train computers to perform a variety of complex tasks and improve with experience. Computers learn how to recognize patterns, make unintended decisions, or react to a dynamic environment. Certain trained machines may be more effective than others because they are based on more suitable ML algorithms or because they were trained through superior training sets. Although ML algorithms are known and publicly released, training sets may not be reasonably ascertainable and, indeed, may be guarded as trade secrets. While much research has been performed about the privacy of the elements of training sets, in this paper we focus our attention on ML classifiers and on the statistical information that can be unconsciously or maliciously revealed from them. We show that it is possible to infer unexpected but useful information from ML classifiers. In particular, we build a novel meta-classifier and train it to hack other classifiers, obtaining meaningful information about their training sets. This kind of information leakage can be exploited, for example, by a vendor to build more effective classifiers or to simply acquire trade secrets from a competitor's apparatus, potentially violating its intellectual property rights.
A Linear Approximation to the chi^2 Kernel with Geometric Convergence
Li, Fuxin, Lebanon, Guy, Sminchisescu, Cristian
We propose a new analytical approximation to the $\chi^2$ kernel that converges geometrically. The analytical approximation is derived with elementary methods and adapts to the input distribution for optimal convergence rate. Experiments show the new approximation leads to improved performance in image classification and semantic segmentation tasks using a random Fourier feature approximation of the $\exp-\chi^2$ kernel. Besides, out-of-core principal component analysis (PCA) methods are introduced to reduce the dimensionality of the approximation and achieve better performance at the expense of only an additional constant factor to the time complexity. Moreover, when PCA is performed jointly on the training and unlabeled testing data, further performance improvements can be obtained. Experiments conducted on the PASCAL VOC 2010 segmentation and the ImageNet ILSVRC 2010 datasets show statistically significant improvements over alternative approximation methods.
Robust Support Vector Machines for Speaker Verification Task
Zergat, Kawthar Yasmine, Amrouche, Abderrahmane
An important step in speaker verification is extracting features that best characterize the speaker voice. This paper investigates a front-end processing that aims at improving the performance of speaker verification based on the SVMs classifier, in text independent mode. This approach combines features based on conventional Mel-cepstral Coefficients (MFCCs) and Line Spectral Frequencies (LSFs) to constitute robust multivariate feature vectors. To reduce the high dimensionality required for training these feature vectors, we use a dimension reduction method called principal component analysis (PCA). In order to evaluate the robustness of these systems, different noisy environments have been used. The obtained results using TIMIT database showed that, using the paradigm that combines these spectral cues leads to a significant improvement in verification accuracy, especially with PCA reduction for low signal-to-noise ratio noisy environment.
Efficient Regularized Least-Squares Algorithms for Conditional Ranking on Relational Data
Pahikkala, Tapio, Airola, Antti, Stock, Michiel, De Baets, Bernard, Waegeman, Willem
In domains like bioinformatics, information retrieval and social network analysis, one can find learning tasks where the goal consists of inferring a ranking of objects, conditioned on a particular target object. We present a general kernel framework for learning conditional rankings from various types of relational data, where rankings can be conditioned on unseen data objects. We propose efficient algorithms for conditional ranking by optimizing squared regression and ranking loss functions. We show theoretically, that learning with the ranking loss is likely to generalize better than with the regression loss. Further, we prove that symmetry or reciprocity properties of relations can be efficiently enforced in the learned models. Experiments on synthetic and real-world data illustrate that the proposed methods deliver state-of-the-art performance in terms of predictive power and computational efficiency. Moreover, we also show empirically that incorporating symmetry or reciprocity properties can improve the generalization performance.
Emotional Expression Classification using Time-Series Kernels
Lorincz, Andras, Jeni, Laszlo, Szabo, Zoltan, Cohn, Jeffrey, Kanade, Takeo
Estimation of facial expressions, as spatio-temporal processes, can take advantage of kernel methods if one considers facial landmark positions and their motion in 3D space. We applied support vector classification with kernels derived from dynamic time-warping similarity measures. We achieved over 99% accuracy - measured by area under ROC curve - using only the 'motion pattern' of the PCA compressed representation of the marker point vector, the so-called shape parameters. Beyond the classification of full motion patterns, several expressions were recognized with over 90% accuracy in as few as 5-6 frames from their onset, about 200 milliseconds.
$\propto$SVM for learning with label proportions
Yu, Felix X., Liu, Dong, Kumar, Sanjiv, Jebara, Tony, Chang, Shih-Fu
We study the problem of learning with label proportions in which the training data is provided in groups and only the proportion of each class in each group is known. We propose a new method called proportion-SVM, or $\propto$SVM, which explicitly models the latent unknown instance labels together with the known group label proportions in a large-margin framework. Unlike the existing works, our approach avoids making restrictive assumptions about the data. The $\propto$SVM model leads to a non-convex integer programming problem. In order to solve it efficiently, we propose two algorithms: one based on simple alternating optimization and the other based on a convex relaxation. Extensive experiments on standard datasets show that $\propto$SVM outperforms the state-of-the-art, especially for larger group sizes.
One-Class Support Measure Machines for Group Anomaly Detection
Muandet, Krikamol, Schölkopf, Bernhard
We propose one-class support measure machines (OCSMMs) for group anomaly detection which aims at recognizing anomalous aggregate behaviors of data points. The OCSMMs generalize well-known one-class support vector machines (OCSVMs) to a space of probability measures. By formulating the problem as quantile estimation on distributions, we can establish an interesting connection to the OCSVMs and variable kernel density estimators (VKDEs) over the input space on which the distributions are defined, bridging the gap between large-margin methods and kernel density estimators. In particular, we show that various types of VKDEs can be considered as solutions to a class of regularization problems studied in this paper. Experiments on Sloan Digital Sky Survey dataset and High Energy Particle Physics dataset demonstrate the benefits of the proposed framework in real-world applications.
Feature Ranking and Support Vector Machines Classification Analysis of the NSL-KDD Intrusion Detection Corpus
Calix, Ricardo A. (Purdue University Calumet) | Sankaran, Rajesh (Argonne National Laboratory)
Currently, signature based Intrusion Detection Systems (IDS) approaches are inadequate to address threats posed to networked systems by zero-day exploits. Statistical machine learning techniques offer a great opportunity to mitigate these threats. However, at this point, statistical based IDS systems are not mature enough to be implemented in realtime systems and the techniques to be used are not sufficiently understood. This study focuses on a recently expanded corpus for IDS analysis. Feature analysis and Support Vector Machines classification are performed to obtain a better understanding of the corpus and to establish a baseline set of results which can be used by other studies for comparison. Results of the classification and feature analysis are discussed.