Support Vector Machines
Estimation of scale functions to model heteroscedasticity by support vector machines
Hable, Robert, Christmann, Andreas
A main goal of regression is to derive statistical conclusions on the conditional distribution of the output variable Y given the input values x. Two of the most important characteristics of a single distribution are location and scale. Support vector machines (SVMs) are well established to estimate location functions like the conditional median or the conditional mean. We investigate the estimation of scale functions by SVMs when the conditional median is unknown, too. Estimation of scale functions is important e.g. to estimate the volatility in finance. We consider the median absolute deviation (MAD) and the interquantile range (IQR) as measures of scale. Our main result shows the consistency of MAD-type SVMs.
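As a point of reference for the two scale measures named in this abstract, the following is a minimal sketch of the ordinary (unconditional) sample MAD and interquantile range. It is not the SVM-based conditional estimator studied in the paper. The function names `mad` and `iqr` and the toy sample are assumptions made for illustration.

```python
import numpy as np

def mad(y):
    """Sample median absolute deviation: median of |y - median(y)|."""
    y = np.asarray(y, dtype=float)
    return float(np.median(np.abs(y - np.median(y))))

def iqr(y, lower=0.25, upper=0.75):
    """Sample interquantile range: upper quantile minus lower quantile."""
    y = np.asarray(y, dtype=float)
    q_lo, q_hi = np.quantile(y, [lower, upper])
    return float(q_hi - q_lo)

# Toy sample with one gross outlier (hypothetical data).
sample = [1.0, 2.0, 3.0, 4.0, 100.0]
print(mad(sample), iqr(sample))
```

Both measures are driven by order statistics, so the outlier in the toy sample has little influence on them, which is one reason they are natural choices when robustness matters.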
Qualitative Robustness of Support Vector Machines
Hable, Robert, Christmann, Andreas
Support vector machines have attracted much attention in theoretical and in applied statistics. Main topics of recent interest are consistency, learning rates and robustness. In this article, it is shown that support vector machines are qualitatively robust. Since support vector machines can be represented by a functional on the set of all probability measures, qualitative robustness is proven by showing that this functional is continuous with respect to the topology generated by weak convergence of probability measures. Combined with the existence and uniqueness of support vector machines, our results show that support vector machines are the solutions of a well-posed mathematical problem in Hadamard's sense.
Linearized Additive Classifiers
We revisit the additive model learning literature and adapt a penalized spline formulation due to Eilers and Marx [4] to train additive classifiers efficiently. We also propose two new embeddings based on two classes of orthogonal bases with orthogonal derivatives, which can also be used to learn additive classifiers efficiently. This paper follows the popular theme in the current literature where kernel SVMs are learned much more efficiently using an approximate embedding and a linear machine. We show that spline bases are especially well suited for learning additive models because of their sparsity structure and the ease of computing the embedding, which enables one to train these models in an online manner without incurring the memory overhead of precomputing and storing the embeddings. We show interesting connections between the B-spline basis and the histogram intersection kernel, and show that for a particular choice of regularization and degree of the B-splines, our proposed learning algorithm closely approximates the histogram intersection kernel SVM. This enables one to learn additive models with almost no memory overhead compared to a fast linear solver, such as LIBLINEAR, while being only 5-6 times slower on average. On two large-scale image classification datasets, MNIST and Daimler Chrysler pedestrians, the proposed additive classifiers are as accurate as the kernel SVM, while being two orders of magnitude faster to train.
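For readers unfamiliar with the histogram intersection kernel mentioned in this abstract, here is a minimal sketch of the kernel itself, K(x, y) = sum_i min(x_i, y_i). It illustrates why the kernel is additive (a sum of one-dimensional terms) and does not reproduce the paper's spline embeddings. The function names and the toy histograms are assumptions.

```python
import numpy as np

def histogram_intersection(x, y):
    """Additive kernel: sum over dimensions of min(x_i, y_i)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.minimum(x, y).sum())

def intersection_gram(X, Y):
    """Gram matrix of the kernel between rows of X (n, d) and rows of Y (m, d)."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    # Broadcast to (n, m, d), take elementwise minima, sum over dimensions.
    return np.minimum(X[:, None, :], Y[None, :, :]).sum(axis=2)

# Two toy normalized histograms (hypothetical data).
H = np.array([[0.2, 0.5, 0.3],
              [0.1, 0.6, 0.3]])
print(histogram_intersection(H[0], H[1]))
print(intersection_gram(H, H))
```

A Gram matrix of this form can be handed to any kernel SVM solver that accepts precomputed kernels, for example scikit-learn's `SVC` with `kernel="precomputed"`.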
Kernel Bayes' rule
Fukumizu, Kenji, Song, Le, Gretton, Arthur
Kernel methods have long provided powerful tools for generalizing linear statistical approaches to nonlinear settings, through an embedding of the sample to a high dimensional feature space, namely a reproducing kernel Hilbert space (RKHS) [18, 28]. Examples include support vector machines, kernel PCA, and kernel CCA, among others. In these cases, data are mapped via a canonical feature map to a reproducing kernel Hilbert space (of high or even infinite dimension), in which the linear operations that define the algorithms are implemented. The inner product between feature mappings need never be computed explicitly, but is given by a positive definite kernel function unique to the RKHS: this permits efficient computation without the need to deal explicitly with the feature representation. The mappings of individual points to a feature space may be generalized to mappings of probability measures [e.g. 3, Chapter 4]. We call such mappings the kernel means of the underlying random variables.
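The kernel mean described above can be made concrete with an empirical sketch: the kernel mean of a sample is the average of its feature maps, so the inner product between two empirical kernel means is just the average of the corresponding kernel matrix. The Gaussian kernel, the bandwidth `sigma`, the function names, and the toy data below are all assumptions chosen for illustration. This is not the Kernel Bayes' rule algorithm itself.

```python
import numpy as np

def gaussian_gram(X, Y, sigma=1.0):
    """Gaussian kernel matrix between rows of X (n, d) and rows of Y (m, d)."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    sq_dists = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))

def kernel_mean_inner(X, Y, sigma=1.0):
    """Inner product of the empirical kernel means of X and Y in the RKHS."""
    return float(gaussian_gram(X, Y, sigma).mean())

def kernel_mean_sq_distance(X, Y, sigma=1.0):
    """Squared RKHS distance between the two empirical kernel means."""
    return (kernel_mean_inner(X, X, sigma)
            + kernel_mean_inner(Y, Y, sigma)
            - 2.0 * kernel_mean_inner(X, Y, sigma))

# Two toy samples (hypothetical data): Y is X shifted by 3.
X = np.array([[0.0], [1.0]])
Y = X + 3.0
print(kernel_mean_sq_distance(X, X), kernel_mean_sq_distance(X, Y))
```

Only kernel evaluations are needed, which mirrors the point made in the abstract that the feature representation never has to be computed explicitly.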
Nonlinear Channel Estimation for OFDM System by Complex LS-SVM under High Mobility Conditions
Charrada, Anis, Samet, Abdelaziz
A nonlinear channel estimator using complex least squares support vector machines (LS-SVM) is proposed for a pilot-aided OFDM system and applied to the Long Term Evolution (LTE) downlink under high mobility conditions. The estimation algorithm makes use of the reference signals to estimate the total frequency response of the highly selective multipath channel in the presence of non-Gaussian impulse noise interfering with pilot signals. To this end, the algorithm maps training data into a high dimensional feature space and uses the structural risk minimization (SRM) principle to carry out the regression estimation for the frequency response function of the highly selective channel. Simulations show that the proposed method is effective: it tracks the variations of the fading channels with good performance and high precision compared to the conventional LS method, and it is robust under high-speed mobility.
Submodular Optimization for Efficient Semi-supervised Support Vector Machines
Emara, Wael, Kantardzic, Mehmed
In this work we present a quadratic programming approximation of the Semi-Supervised Support Vector Machine (S3VM) problem, namely approximate QP-S3VM, that can be efficiently solved using off-the-shelf optimization packages. We prove that this approximate formulation establishes a relation between the low density separation and the graph-based models of semi-supervised learning (SSL), which is important for developing a unifying framework for semi-supervised learning methods. Furthermore, we propose the novel idea of representing SSL problems as submodular set functions and use efficient submodular optimization algorithms to solve them. Using this new idea we develop a representation of the approximate QP-S3VM as a maximization of a submodular set function, which makes it possible to optimize using efficient greedy algorithms. We demonstrate that the proposed methods are accurate and provide significant improvement in time complexity over the state of the art in the literature. Recent advances in information technology impose serious challenges on traditional machine learning algorithms, where classification models are trained using labeled samples. Data collection and storage have never been easier, and therefore using such enormous volumes of data to infer reliable classification models is of utmost importance.
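The abstract above refers to maximizing a submodular set function with greedy algorithms. As a generic illustration of that idea, the sketch below runs the standard greedy rule (repeatedly add the element with the largest marginal gain) under a cardinality constraint on a toy set-coverage function. The coverage objective, the names `greedy_maximize` and `coverage`, and the data are assumptions; this is not the paper's QP-S3VM objective.

```python
def greedy_maximize(ground_set, f, k):
    """Greedily pick up to k elements, each time adding the largest marginal gain of f."""
    selected = []
    remaining = list(ground_set)
    for _ in range(min(k, len(remaining))):
        base = f(selected)
        best, best_gain = None, 0.0
        for element in remaining:
            gain = f(selected + [element]) - base
            if best is None or gain > best_gain:
                best, best_gain = element, gain
        if best_gain <= 0:
            break  # no remaining element improves the objective
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy submodular objective (hypothetical): number of items covered by the chosen sets.
SETS = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6, 7}, "d": {1, 7}}

def coverage(chosen):
    covered = set()
    for name in chosen:
        covered |= SETS[name]
    return len(covered)

picked = greedy_maximize(list(SETS), coverage, k=2)
print(picked, coverage(picked))
```

For monotone submodular objectives under a cardinality constraint, this greedy rule is known to achieve at least a (1 - 1/e) fraction of the optimal value, which is what makes greedy optimization attractive in this setting.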
A Microtext Corpus for Persuasion Detection in Dialog
Young, Joel (Naval Postgraduate School) | Martell, Craig (Naval Postgraduate School) | Anand, Pranav (University of California, Santa Cruz) | Ortiz, Pedro (United States Naval Academy) | Henry Tucker Gilbert, IV (Naval Postgraduate School)
Automatic detection of persuasion is essential for machine interaction on the social web. To facilitate automated persuasion detection, we present a novel microtext corpus derived from hostage negotiation transcripts as well as a detailed manual (codebook) for persuasion annotation. Our corpus, called the NPS Persuasion Corpus, consists of 37 transcripts from four sets of hostage negotiation transcriptions. Each utterance in the corpus is hand annotated for one of nine categories of persuasion based on Cialdini’s model: reciprocity, commitment, consistency, liking, authority, social proof, scarcity, other, and not persuasive. Initial results using three supervised learning algorithms (Naïve Bayes, Maximum Entropy, and Support Vector Machines) combined with gappy and orthogonal sparse bigram feature expansion techniques show that the annotation process did capture machine learnable features of persuasion with F-scores better than baseline.
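To make the two feature expansion techniques named above more tangible, here is a minimal sketch under one common reading of the terms: a gappy bigram pairs a word with any later word within a small window, and an orthogonal sparse bigram additionally records how many tokens were skipped. The exact definitions and window sizes used by the authors are not given in the abstract, so the `max_gap` parameter, the function names, and the toy utterance are assumptions.

```python
def orthogonal_sparse_bigrams(tokens, max_gap=1):
    """Pairs (w1, w2, gap) where w2 follows w1 with `gap` skipped tokens, gap <= max_gap."""
    features = []
    for i, first in enumerate(tokens):
        for gap in range(max_gap + 1):
            j = i + 1 + gap
            if j < len(tokens):
                features.append((first, tokens[j], gap))
    return features

def gappy_bigrams(tokens, max_gap=1):
    """Same word pairs as above, but without recording the gap size."""
    return [(w1, w2) for (w1, w2, _gap) in orthogonal_sparse_bigrams(tokens, max_gap)]

# Toy utterance (hypothetical data).
utterance = ["a", "b", "c"]
print(orthogonal_sparse_bigrams(utterance))
print(gappy_bigrams(utterance))
```

Features of this kind can then be counted per utterance and fed to any of the classifiers listed in the abstract.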
Gender Recognition Based on Sift Features
Yousefi, Sahar, Zahedi, Morteza
This paper proposes a robust approach for face detection and gender classification in color images. Previous research on gender recognition assumes a computationally expensive and time-consuming pre-processing step for alignment, in which face images are aligned so that facial landmarks such as the eyes, nose, lips, and chin are placed in uniform locations in the image. In this paper, a novel technique based on mathematical analysis is presented in three stages that eliminates the alignment step. First, a new color-based face detection method is presented that gives better results and is more robust in complex backgrounds. Next, features that are invariant to affine transformations are extracted from each face using the scale invariant feature transform (SIFT) method. To evaluate the performance of the proposed algorithm, experiments have been conducted by employing an SVM classifier on a database of face images which contains 500 images from distinct people with an equal ratio of males and females.
Emerging Applications for Intelligent Diabetes Management
Marling, Cindy (Ohio University) | Wiley, Matthew (Ohio University) | Bunescu, Razvan (Ohio University) | Shubrook, Jay (Ohio University) | Schwartz, Frank (Ohio University)
Diabetes management is a difficult task for patients, who must monitor and control their blood glucose levels in order to avoid serious diabetic complications. It is a difficult task for physicians, who must manually interpret large volumes of blood glucose data to tailor therapy to the needs of each patient. This paper describes three emerging applications that employ AI to ease this task and shares difficulties encountered in transitioning AI technology from university researchers to patients and physicians.
NewsFinder: Automating an Artificial Intelligence News Service
Dong, Liang (Clemson University, South Carolina) | Smith, Reid G. (Marathon Oil Corporation) | Buchanan, Bruce G. (University of Pittsburgh)
NewsFinder automates the steps involved in finding, selecting and publishing news stories that meet subjective judgments of relevance and interest to the Artificial Intelligence community. NewsFinder combines a broad search with AI-specific filters and incorporates a learning program whose judgment of interestingness of stories can be trained by feedback from readers. Since August, 2010, the program has been used to operate the AI in the News service that is part of the AAAI AITopics site.