 Support Vector Machines


Fast Sampling for Bayesian Max-Margin Models

arXiv.org Artificial Intelligence

Bayesian max-margin models have shown superiority in various practical applications, such as text categorization, collaborative prediction, social network link prediction and crowdsourcing, and they conjoin the flexibility of Bayesian modeling and predictive strengths of max-margin learning. However, Monte Carlo sampling for these models still remains challenging, especially for applications that involve large-scale datasets. In this paper, we present the stochastic subgradient Hamiltonian Monte Carlo (HMC) methods, which are easy to implement and computationally efficient. We show the approximate detailed balance property of subgradient HMC which reveals a natural and validated generalization of the ordinary HMC. Furthermore, we investigate the variants that use stochastic subsampling and thermostats for better scalability and mixing. Using stochastic subgradient Markov Chain Monte Carlo (MCMC), we efficiently solve the posterior inference task of various Bayesian max-margin models and extensive experimental results demonstrate the effectiveness of our approach.
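As a rough illustration of the idea, the sketch below runs one leapfrog step of HMC in which the gradient of the potential energy is replaced by a subgradient of a hinge-loss term. The potential (Gaussian prior plus scaled hinge loss), the function names, and the parameters `c`, `prior_var`, and `step` are assumptions made for this example; the abstract does not specify the model or the integrator details.

```python
def potential_subgradient(w, X, y, c=1.0, prior_var=1.0):
    """Subgradient of U(w) = ||w||^2 / (2 * prior_var) + c * sum_i max(0, 1 - y_i * w.x_i).

    This potential is an assumed stand-in for a Bayesian max-margin posterior.
    """
    grad = [wj / prior_var for wj in w]
    for xi, yi in zip(X, y):
        margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
        if margin < 1.0:  # hinge term is active; at the kink we pick subgradient 0
            for j, xj in enumerate(xi):
                grad[j] -= c * yi * xj
    return grad


def leapfrog_step(w, p, X, y, step=0.01):
    """One standard leapfrog update, using the subgradient in place of the gradient."""
    g = potential_subgradient(w, X, y)
    p = [pj - 0.5 * step * gj for pj, gj in zip(p, g)]   # half step for momentum
    w = [wj + step * pj for wj, pj in zip(w, p)]         # full step for position
    g = potential_subgradient(w, X, y)
    p = [pj - 0.5 * step * gj for pj, gj in zip(p, g)]   # second half step for momentum
    return w, p


# Toy data, invented for the example.
X = [[1.0, 2.0], [-1.0, 0.5], [0.3, -1.0]]
y = [1, -1, 1]
w0, p0 = [0.2, -0.1], [0.5, 0.3]
w1, p1 = leapfrog_step(w0, p0, X, y)
```

Because the force depends only on the position, the update stays time-reversible even with a subgradient: negating the momentum and stepping again returns to the starting point, up to rounding.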


Semi-Supervised Active Learning for Support Vector Machines: A Novel Approach that Exploits Structure Information in Data

arXiv.org Machine Learning

In today's information society, more and more data emerge, e.g., in social networks, technical applications, or business applications. Companies try to commercialize these data using data mining or machine learning methods. For this purpose, the data are categorized or classified, but often at high (monetary or temporal) costs. An effective approach to reduce these costs is to apply active learning (AL) methods, as AL controls the training process of a classifier by specifically querying individual data points (samples), which are then labeled (e.g., provided with class memberships) by a domain expert. However, an analysis of current AL research shows that AL still has some shortcomings. In particular, the structure information given by the spatial pattern of the (un)labeled data in the input space of a classification model (e.g., cluster information) is used insufficiently. In addition, many existing AL techniques pay too little attention to their practical applicability. To meet these challenges, this article presents several techniques that together build a new approach for combining AL and semi-supervised learning (SSL) for support vector machines (SVM) in classification tasks. Structure information is captured by means of probabilistic models that are iteratively improved at runtime when label information becomes available. The probabilistic models are considered in a selection strategy based on distance, density, diversity, and distribution (4DS strategy) information for AL and in a kernel function (Responsibility Weighted Mahalanobis kernel) for SVM. The approach fuses generative and discriminative modeling techniques. With 20 benchmark data sets and with the MNIST data set it is shown that our new solution yields significantly better results than state-of-the-art methods.


Exploring the Entire Regularization Path for the Asymmetric Cost Linear Support Vector Machine

arXiv.org Machine Learning

We propose an algorithm for exploring the entire regularization path of asymmetric-cost linear support vector machines. Empirical evidence suggests the predictive power of support vector machines depends on the regularization parameters of the training algorithms. Algorithms exploring the entire regularization path have been proposed for single-cost support vector machines, thereby providing complete knowledge of the behavior of the trained model over the hyperparameter space. Considering the problem in a two-dimensional hyperparameter space, however, gives our algorithm greater flexibility in dealing with special cases and sheds light on problems encountered by algorithms building the paths in one-dimensional spaces. We demonstrate two-dimensional regularization paths for linear support vector machines that we train on synthetic and real data.
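To make the two-dimensional hyperparameter space concrete, here is a minimal sketch of a common asymmetric-cost soft-margin objective, in which slack on positive and negative examples is penalized with separate costs. The parametrization via `c_pos` and `c_neg` and the function name are assumptions for illustration; the abstract does not state the exact formulation used.

```python
def asymmetric_hinge_objective(w, b, X, y, c_pos, c_neg):
    """0.5 * ||w||^2 + c_pos * (slack of positives) + c_neg * (slack of negatives)."""
    reg = 0.5 * sum(wj * wj for wj in w)
    loss = 0.0
    for xi, yi in zip(X, y):
        margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
        slack = max(0.0, 1.0 - margin)            # hinge loss of this sample
        loss += (c_pos if yi > 0 else c_neg) * slack
    return reg + loss


# Toy data, invented for the example: two positives, one negative.
X = [[2.0, 0.0], [0.5, 0.0], [-0.5, 0.0]]
y = [1, 1, -1]
w, b = [1.0, 0.0], 0.0

symmetric = asymmetric_hinge_objective(w, b, X, y, c_pos=1.0, c_neg=1.0)
asymmetric = asymmetric_hinge_objective(w, b, X, y, c_pos=2.0, c_neg=1.0)
```

Setting `c_pos == c_neg` recovers the single-cost case, which corresponds to moving along one line through the (`c_pos`, `c_neg`) plane.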


Recursion-Free Online Multiple Incremental/Decremental Analysis Based on Ridge Support Vector Learning

arXiv.org Machine Learning

This study presents a rapid multiple incremental and decremental mechanism based on Weight-Error Curves (WECs) for support-vector analysis. To handle rapidly increasing amounts of data, recursion-free computation is proposed for predicting the Lagrangian multipliers of new samples. This study examines the characteristics of Ridge Support Vector Models, including Ridge Support Vector Machines and Regression, subsequently devising a recursion-free function derived from WECs. With this proposed function, all of the new Lagrangian multipliers can be computed at once without using any gradual step sizes. Moreover, such a function can relax a constraint, where the increment of new multiple Lagrangian multipliers should be the same in the previous work, thereby easily satisfying the requirement of Karush-Kuhn-Tucker (KKT) conditions. The proposed mechanism no longer requires typical time-consuming bookkeeping strategies, which compute the step size by checking all the training samples in each incremental round. Experiments were carried out on open datasets for evaluating our work. The results showed that the computational speed was successfully enhanced, better than the baselines. Besides, the accuracy still remained. These findings revealed that the proposed method was appropriate for incremental/decremental learning, thereby demonstrating the effectiveness of the proposed idea.


There's an app for that! Using your smartphone to test for Anemia. » Behind the Headlines

#artificialintelligence

I'd be willing to bet that if you were asked to list ten uses for your smartphone, you probably wouldn't include "medical device" in your answer. But as smartphones become increasingly capable, highly-portable computing platforms, researchers are looking to the computer in everyone's pocket as a way to improve global health. As Wired UK declared earlier this year, the next revolutionary medical device is likely to be your smartphone. Scientists have already developed smartphone-based apps that can monitor asthma, detect skin cancer, and diagnose traumatic brain injuries. The latest app that joins the "doctor in your pocket" list is helping screen for anemia.


The rapid evolution of open-source machine learning – Seldon -- Open Source Machine Learning

#artificialintelligence

When millions of people across the world tuned in to watch DeepMind's machine beat the human Go world champion Lee Sedol, they also witnessed a historic victory for open-source. DeepMind used a scientific computing framework called Torch extensively in the development and execution of AlphaGo's neural networks. Torch was first released back in 2002 under a BSD open-source license with algorithms that are still commonly used by data scientists such as multi-layer perceptrons, support vector machines and K-nearest neighbours. Torch also supported ensembles -- a popular technique that combines the output of multiple algorithms, usually with a weighted average. It's not just open-source software that contributed to the growth of machine learning.


Stealing Machine Learning Models via Prediction APIs

arXiv.org Machine Learning

Machine learning (ML) models may be deemed confidential due to their sensitive training data, commercial value, or use in security applications. Increasingly often, confidential ML models are being deployed with publicly accessible query interfaces. ML-as-a-service ("predictive analytics") systems are an example: Some allow users to train models on potentially sensitive data and charge others for access on a pay-per-query basis. The tension between model confidentiality and public access motivates our investigation of model extraction attacks. In such attacks, an adversary with black-box access, but no prior knowledge of an ML model's parameters or training data, aims to duplicate the functionality of (i.e., "steal") the model. Unlike in classical learning theory settings, ML-as-a-service offerings may accept partial feature vectors as inputs and include confidence values with predictions. Given these practices, we show simple, efficient attacks that extract target ML models with near-perfect fidelity for popular model classes including logistic regression, neural networks, and decision trees. We demonstrate these attacks against the online services of BigML and Amazon Machine Learning. We further show that the natural countermeasure of omitting confidence values from model outputs still admits potentially harmful model extraction attacks. Our results highlight the need for careful ML model deployment and new model extraction countermeasures.
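For the logistic regression case, the role of confidence values can be sketched in a few lines: each returned probability gives one linear equation in the unknown parameters once the logit is applied, so d + 1 well-chosen queries pin down a d-dimensional model. The `make_target` function below is a hypothetical stand-in for a prediction API, and the choice of queries (the origin plus the unit vectors) is a simplification made for this example rather than the procedure from the paper.

```python
import math


def make_target(weights, bias):
    """Hypothetical black-box API: returns only the confidence P(y = 1 | x)."""
    def predict_proba(x):
        z = sum(w * xi for w, xi in zip(weights, x)) + bias
        return 1.0 / (1.0 + math.exp(-z))
    return predict_proba


def logit(p):
    """Inverse of the sigmoid: recovers w.x + b from a confidence value."""
    return math.log(p / (1.0 - p))


def extract_logistic(query, dim):
    """Recover weights and bias with dim + 1 queries."""
    bias = logit(query([0.0] * dim))              # x = 0 reveals the bias
    weights = []
    for i in range(dim):
        unit = [0.0] * dim
        unit[i] = 1.0
        weights.append(logit(query(unit)) - bias)  # x = e_i reveals w_i + b
    return weights, bias


# Secret parameters, invented for the example; the attacker only sees `api`.
secret_w, secret_b = [0.5, -1.25, 2.0], 0.3
api = make_target(secret_w, secret_b)
stolen_w, stolen_b = extract_logistic(api, dim=3)
```

If the API returned only class labels, the logit step would no longer be available, which is why omitting confidence values is the natural countermeasure the abstract mentions.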


Minimum Density Hyperplanes

arXiv.org Machine Learning

Associating distinct groups of objects (clusters) with contiguous regions of high probability density (high-density clusters), is central to many statistical and machine learning approaches to the classification of unlabelled data. We propose a novel hyperplane classifier for clustering and semi-supervised classification which is motivated by this objective. The proposed minimum density hyperplane minimises the integral of the empirical probability density function along it, thereby avoiding intersection with high density clusters. We show that the minimum density and the maximum margin hyperplanes are asymptotically equivalent, thus linking this approach to maximum margin clustering and semi-supervised support vector classifiers. We propose a projection pursuit formulation of the associated optimisation problem which allows us to find minimum density hyperplanes efficiently in practice, and evaluate its performance on a range of benchmark data sets. The proposed approach is found to be very competitive with state of the art methods for clustering and semi-supervised classification.
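The quantity being minimised can be illustrated with a small sketch. If the empirical density is an isotropic Gaussian kernel density estimate (an assumption made here; the abstract does not name the estimator), its integral over the hyperplane {x : v.x = b} reduces to a one-dimensional kernel density estimate of the projected data, evaluated at b. The function name and the bandwidth `h` are illustrative.

```python
import math


def hyperplane_density(v, b, X, h):
    """Integral of an isotropic Gaussian KDE (bandwidth h) over {x : v.x = b}.

    v is normalised to unit length first, so b is measured along the unit normal.
    """
    norm = math.sqrt(sum(vj * vj for vj in v))
    unit = [vj / norm for vj in v]
    projections = [sum(uj * xj for uj, xj in zip(unit, x)) for x in X]
    kernel_sum = sum(math.exp(-0.5 * ((b - p) / h) ** 2) for p in projections)
    return kernel_sum / (len(X) * h * math.sqrt(2.0 * math.pi))


# Two well-separated toy clusters, invented for the example.
X = [[-2.0, 0.0], [-2.1, 1.0], [2.0, 0.0], [2.1, -1.0]]
between = hyperplane_density([1.0, 0.0], 0.0, X, h=0.5)   # passes between the clusters
through = hyperplane_density([1.0, 0.0], 2.0, X, h=0.5)   # cuts through one cluster
```

A hyperplane lying in the gap between the clusters gets a much smaller value than one cutting through a cluster, which is the behaviour the minimum density criterion rewards.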


Three Things About Data Science You Won't Find In the Books

#artificialintelligence

In case you haven't heard yet, Data Science is all the craze. Courses, posts, and schools are springing up everywhere. However, every time I take a look at one of those offerings, I see that a lot of emphasis is put on specific learning algorithms. Of course, understanding how logistic regression or deep learning works is cool, but once you start working with data, you find out that there are other things equally important, or maybe even more. I can't really blame these courses.


A Hybrid Machine Learning Method for Fusing fMRI and Genetic Data: Combining both Improves Classification of Schizophrenia

#artificialintelligence

We demonstrate a hybrid machine learning method to classify schizophrenia patients and healthy controls, using functional magnetic resonance imaging (fMRI) and single nucleotide polymorphism (SNP) data. The method consists of four stages: (1) SNPs with the most discriminating information between the healthy controls and schizophrenia patients are selected to construct a support vector machine ensemble (SNP-SVME). The method was evaluated by a fully validated leave-one-out method using 40 subjects (20 patients and 20 controls). The classification accuracy was: 0.74 for SNP-SVME, 0.82 for Voxel-SVME, 0.83 for ICA-SVMC, and 0.87 for Combined SNP-fMRI. Experimental results show that better classification accuracy was achieved by combining genetic and fMRI data than by using either alone, indicating that genetic and brain function represent different, but partially complementary, aspects of schizophrenia etiopathology.