Support Vector Machines
Learning Kernels with Radiuses of Minimum Enclosing Balls
Kun Gai, Guangyun Chen, Chang-shui Zhang
In this paper, we point out that there exist scaling and initialization problems in most existing multiple kernel learning (MKL) approaches, which employ the large margin principle to jointly learn both a kernel and an SVM classifier. The reason is that the margin itself can not well describe how good a kernel is due to the negligence of the scaling. We use the ratio between the margin and the radius of the minimum enclosing ball to measure the goodness of a kernel, and present a new minimization formulation for kernel learning. This formulation is invariant to scalings of learned kernels, and when learning linear combination of basis kernels it is also invariant to scalings of basis kernels and to the types (e.g., L
One Class Restricted Kernel Machines
Quadir, A., Sajid, M., Tanveer, M.
Restricted kernel machines (RKMs) have demonstrated a significant impact in enhancing generalization ability in the field of machine learning. Recent studies have introduced various methods within the RKM framework, combining kernel functions with the least squares support vector machine (LSSVM) in a manner similar to the energy function of restricted boltzmann machines (RBM), such that a better performance can be achieved. However, RKM's efficacy can be compromised by the presence of outliers and other forms of contamination within the dataset. These anomalies can skew the learning process, leading to less accurate and reliable outcomes. To address this critical issue and to ensure the robustness of the model, we propose the novel one-class RKM (OCRKM). In the framework of OCRKM, we employ an energy function akin to that of the RBM, which integrates both visible and hidden variables in a nonprobabilistic setting. The formulation of the proposed OCRKM facilitates the seamless integration of one-class classification method with the RKM, enhancing its capability to detect outliers and anomalies effectively. The proposed OCRKM model is evaluated over UCI benchmark datasets. Experimental findings and statistical analyses consistently emphasize the superior generalization capabilities of the proposed OCRKM model over baseline models across all scenarios.
Estimation of Food Intake Quantity Using Inertial Signals from Smartwatches
Levi, Ioannis, Kyritsis, Konstantinos, Papapanagiotou, Vasileios, Tsakiridis, Georgios, Delopoulos, Anastasios
Accurate monitoring of eating behavior is crucial for managing obesity and eating disorders such as bulimia nervosa. At the same time, existing methods rely on multiple and/or specialized sensors, greatly harming adherence and ultimately, the quality and continuity of data. This paper introduces a novel approach for estimating the weight of a bite, from a commercial smartwatch. Our publicly-available dataset contains smartwatch inertial data from ten participants, with manually annotated start and end times of each bite along with their corresponding weights from a smart scale, under semi-controlled conditions. The proposed method combines extracted behavioral features such as the time required to load the utensil with food, with statistical features of inertial signals, that serve as input to a Support Vector Regression model to estimate bite weights. Under a leave-one-subject-out cross-validation scheme, our approach achieves a mean absolute error (MAE) of 3.99 grams per bite. To contextualize this performance, we introduce the improvement metric, that measures the relative MAE difference compared to a baseline model. Our method demonstrates a 17.41% improvement, while the adapted state-of-the art method shows a -28.89% performance against that same baseline. The results presented in this work establish the feasibility of extracting meaningful bite weight estimates from commercial smartwatch inertial sensors alone, laying the groundwork for future accessible, non-invasive dietary monitoring systems.
Bayesian Nonlinear Support Vector Machines and Discriminative Factor Modeling
Ricardo Henao, Xin Yuan, Lawrence Carin
A new Bayesian formulation is developed for nonlinear support vector machines (SVMs), based on a Gaussian process and with the SVM hinge loss expressed as a scaled mixture of normals. We then integrate the Bayesian SVM into a factor model, in which feature learning and nonlinear classifier design are performed jointly; almost all previous work on such discriminative feature learning has assumed a linear classifier. Inference is performed with expectation conditional maximization (ECM) and Markov Chain Monte Carlo (MCMC). An extensive set of experiments demonstrate the utility of using a nonlinear Bayesian SVM within discriminative feature learning and factor modeling, from the standpoints of accuracy and interpretability.
Predicting Useful Neighborhoods for Lazy Local Learning
Lazy local learning methods train a classifier "on the fly" at test time, using only a subset of the training instances that are most relevant to the novel test example. The goal is to tailor the classifier to the properties of the data surrounding the test example. Existing methods assume that the instances most useful for building the local model are strictly those closest to the test example. However, this fails to account for the fact that the success of the resulting classifier depends on the full distribution of selected training instances. Rather than simply gathering the test example's nearest neighbors, we propose to predict the subset of training data that is jointly relevant to training its local model. We develop an approach to discover patterns between queries and their "good" neighborhoods using large-scale multilabel classification with compressed sensing. Given a novel test point, we estimate both the composition and size of the training subset likely to yield an accurate local model. We demonstrate the approach on image classification tasks on SUN and aPascal and show its advantages over traditional global and local approaches.
Latent Support Measure Machines for Bag-of-Words Data Classification
Yuya Yoshikawa, Tomoharu Iwata, Hiroshi Sawada
In many classification problems, the input is represented as a set of features, e.g., the bag-of-words (BoW) representation of documents. Support vector machines (SVMs) are widely used tools for such classification problems. The performance of the SVMs is generally determined by whether kernel values between data points can be defined properly. However, SVMs for BoW representations have a major weakness in that the co-occurrence of different but semantically similar words cannot be reflected in the kernel calculation. To overcome the weakness, we propose a kernel-based discriminative classifier for BoW data, which we call the latent support measure machine (latent SMM). With the latent SMM, a latent vector is associated with each vocabulary term, and each document is represented as a distribution of the latent vectors for words appearing in the document. To represent the distributions efficiently, we use the kernel embeddings of distributions that hold high order moment information about distributions. Then the latent SMM finds a separating hyperplane that maximizes the margins between distributions of different classes while estimating latent vectors for words to improve the classification performance. In the experiments, we show that the latent SMM achieves state-of-the-art accuracy for BoW text classification, is robust with respect to its own hyper-parameters, and is useful to visualize words.
Object Localization based on Structural SVM using Privileged Information
Jan Feyereisl, Suha Kwak, Jeany Son, Bohyung Han
We propose a structured prediction algorithm for object localization based on Support Vector Machines (SVMs) using privileged information. Privileged information provides useful high-level knowledge for image understanding and facilitates learning a reliable model even with a small number of training examples. In our setting, we assume that such information is available only at training time since it may be difficult to obtain from visual data accurately without human supervision. Our goal is to improve performance by incorporating privileged information into ordinary learning framework and adjusting model parameters for better generalization. We tackle object localization problem based on a novel structural SVM using privileged information, where an alternating loss-augmented inference procedure is employed to handle the term in the objective function corresponding to privileged information. We apply the proposed algorithm to the Caltech-UCSD Birds 200-2011 dataset, and obtain encouraging results suggesting further investigation into the benefit of privileged information in structured prediction.