AITopics | Support Vector Machines

Collaborating Authors

Support Vector Machines

Support vector machines (SVMs, also support vector networks[1]) are supervised learning models with associated learning algorithms that analyze data used for classification and regression analysis. (Wikipedia)

News Overviews Instructional Materials AI-Alerts Classics

Differentially Private Empirical Risk Minimization: Efficient Algorithms and Tight Error Bounds

Bassily, Raef, Smith, Adam, Thakurta, Abhradeep

arXiv.org Machine LearningOct-17-2014

In this paper, we initiate a systematic investigation of differentially private algorithms for convex empirical risk minimization. Various instantiations of this problem have been studied before. We provide new algorithms and matching lower bounds for private ERM assuming only that each data point's contribution to the loss function is Lipschitz bounded and that the domain of optimization is bounded. We provide a separate set of algorithms and matching lower bounds for the setting in which the loss functions are known to also be strongly convex. Our algorithms run in polynomial time, and in some cases even match the optimal non-private running time (as measured by oracle complexity). We give separate algorithms (and lower bounds) for $(\epsilon,0)$- and $(\epsilon,\delta)$-differential privacy; perhaps surprisingly, the techniques used for designing optimal algorithms in the two cases are completely different. Our lower bounds apply even to very simple, smooth function families, such as linear and quadratic functions. This implies that algorithms from previous work can be used to obtain optimal error rates, under the additional assumption that the contributions of each data point to the loss function is smooth. We show that simple approaches to smoothing arbitrary loss functions (in order to apply previous techniques) do not yield optimal error rates. In particular, optimal algorithms were not previously known for problems such as training support vector machines and the high-dimensional median.

algorithm, artificial intelligence, machine learning, (17 more...)

arXiv.org Machine Learning

1405.7085

Country: North America > United States (0.27)

Genre: Research Report (0.64)

Industry: Information Technology > Security & Privacy (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.54)

Add feedback

Complexity Issues and Randomization Strategies in Frank-Wolfe Algorithms for Machine Learning

Frandi, Emanuele, Nanculef, Ricardo, Suykens, Johan

arXiv.org Machine LearningOct-15-2014

Frank-Wolfe algorithms for convex minimization have recently gained considerable attention from the Optimization and Machine Learning communities, as their properties make them a suitable choice in a variety of applications. However, as each iteration requires to optimize a linear model, a clever implementation is crucial to make such algorithms viable on large-scale datasets. For this purpose, approximation strategies based on a random sampling have been proposed by several researchers. In this work, we perform an experimental study on the effectiveness of these techniques, analyze possible alternatives and provide some guidelines based on our results.

algorithm, artificial intelligence, machine learning, (12 more...)

arXiv.org Machine Learning

1410.4062

Country:

North America > United States (0.49)
Europe (0.47)

Genre: Research Report > New Finding (1.00)

Industry: Government (0.33)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.49)

Add feedback

Fast Multilevel Support Vector Machines

Razzaghi, Talayeh, Safro, Ilya

arXiv.org Machine LearningOct-13-2014

Solving different types of optimization models (including parameters fitting) for support vector machines on large-scale training data is often an expensive computational task. This paper proposes a multilevel algorithmic framework that scales efficiently to very large data sets. Instead of solving the whole training set in one optimization process, the support vectors are obtained and gradually refined at multiple levels of coarseness of the data. The proposed framework includes: (a) construction of hierarchy of large-scale data coarse representations, and (b) a local processing of updating the hyperplane throughout this hierarchy. Our multilevel framework substantially improves the computational time without loosing the quality of classifiers. The algorithms are demonstrated for both regular and weighted support vector machines. Experimental results are presented for balanced and imbalanced classification problems. Quality improvement on several imbalanced data sets has been observed.

algorithm, artificial intelligence, machine learning, (15 more...)

arXiv.org Machine Learning

1410.3348

Country: North America > United States (0.28)

Genre: Research Report (0.64)

Industry: Health & Medicine (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)

Add feedback

Markov Random Fields and Mass Spectra Discrimination

Kong, Ao, Azencott, Robert

arXiv.org Machine LearningOct-13-2014

Mass spectrometry can involve two soft ionization techniques: matrix-assisted laser desorption ionization (MALDI) and surface-enhanced laser desorption and ionization (SELDI). For each analyzed fluid sample, MALDI or SELDI hardwares generate a high-dimensional mass spectrum, recording between 10,000 and 20,000 "mass-to-charge (m/z) ratios" corresponding to the ionized peptides present in the fluid sample, as well as "intensities" roughly quantifying the concentrations of these peptides in the sample. Generally m/z ratios take values anywhere between 200 and 20,000 Daltons, and are acquired with a known relative accuracy ρ which depends on the acquisition modalities, and ranges from 0.1% to 0.3%. Analyzing this type of high dimensional data oftern requires specialized software tools, implementing sophisticated machine learning techniques such as SVM (support vector machines) (Li and others (2004), Yu and others (2005)), artificial neural networks (Ball and others (2002)), or random forests (Izmirlian (2004)). These techniques typically generate "black-box" classifiers, which often reach good discrimination levels between cancerous and control groups, but are difficult to interpret biologically in terms of characteristic biomarkers patterns. This often leads to unexpected performance variations on totally new data sets. To develop clinically usable software tools for analysis of mass spectra acuired by MALDI or SELDI hardwares, a key step is to implement automated discovery of explicit "signatures", i.e. short lists of proteomic biomarkers with high discriminating powers between cancer groups (Yasui and others (2003)). Some easily interpretable automatic classifiers, such as linear combinations of biomarker weights (Wang and Chang (2011)), can be found in previous studies, but these approaches do not attempt to quantify the discriminating impact of simultaneous presence for specific pairs of biomarkers.

artificial intelligence, machine learning, mass spectra, (15 more...)

arXiv.org Machine Learning

1410.3234

Country: North America > United States (0.68)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.50)

Add feedback

Generalization Bounds for Learning with Linear, Polygonal, Quadratic and Conic Side Knowledge

Tulabandhula, Theja, Rudin, Cynthia

arXiv.org Machine LearningOct-7-2014

Mach Learn manuscript No. (will be inserted by the editor) Abstract In this paper, we consider a supervised learning setting where side knowledge is provided about the labels of unlabeled examples. The side knowledge has the effect of reducing the hypothesis space, leading to tighter generalization bounds, and thus possibly better generalization. We consider several types of side knowledge, the first leading to linear and polygonal constraints on the hypothesis space, the second leading to quadratic constraints, and the last leading to conic constraints. We show how different types of domain knowledge can lead directly to these kinds of side knowledge. We prove bounds on complexity measures of the hypothesis space for quadratic and conic side knowledge, and show that these bounds are tight in a specific sense for the quadratic case. Keywords statistical learning theory · generalization bounds · Rademacher complexity · covering numbers, constrained linear function classes · side knowledge 1 Introduction Surely, for many applications the amount of domain knowledge we could potentially use within our learning processes is vastly larger than the amount of domain knowledge we actually use. One reason for this is that domain knowledge may be nontrivial to incorporate into algorithms or analysis. For example, researchers in NLP (Natural Language Processing) have long figured out various exotic domain specific knowledge that one can use while performing a learning task [Chang et al., 2008a,b]. The present work aims to provide theoretical guarantees for a large class of problems with a general type of domain knowledge that goes beyond sparsity and smoothness. To define this large class of problems, we will keep the usual supervised learning assumption that the training examples are drawn i.i.d.

artificial intelligence, constraint, machine learning, (17 more...)

arXiv.org Machine Learning

1405.7764

Country: North America > United States > Massachusetts (0.46)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Add feedback

Fast Prediction with SVM Models Containing RBF Kernels

Claesen, Marc, De Smet, Frank, Suykens, Johan A. K., De Moor, Bart

arXiv.org Machine LearningOct-3-2014

We present an approximation scheme for support vector machine models that use an RBF kernel. A second-order Maclaurin series approximation is used for exponentials of inner products between support vectors and test instances. The approximation is applicable to all kernel methods featuring sums of kernel evaluations and makes no assumptions regarding data normalization. The prediction speed of approximated models no longer relates to the amount of support vectors but is quadratic in terms of the number of input dimensions. If the number of input dimensions is small compared to the amount of support vectors, the approximated model is significantly faster in prediction and has a smaller memory footprint. An optimized C++ implementation was made to assess the gain in prediction speed in a set of practical tests. We additionally provide a method to verify the approximation accuracy, prior to training models or during run-time, to ensure the loss in accuracy remains acceptable and within known bounds.

approximation, artificial intelligence, machine learning, (14 more...)

arXiv.org Machine Learning

1403.0736

Country: Europe > Belgium (0.15)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)

Add feedback

Dictionary learning for fast classification based on soft-thresholding

Fawzi, Alhussein, Davies, Mike, Frossard, Pascal

arXiv.org Machine LearningOct-2-2014

Classifiers based on sparse representations have recently been shown to provide excellent results in many visual recognition and classification tasks. However, the high cost of computing sparse representations at test time is a major obstacle that limits the applicability of these methods in large-scale problems, or in scenarios where computational power is restricted. We consider in this paper a simple yet efficient alternative to sparse coding for feature extraction. We study a classification scheme that applies the soft-thresholding nonlinear mapping in a dictionary, followed by a linear classifier. A novel supervised dictionary learning algorithm tailored for this low complexity classification architecture is proposed. The dictionary learning problem, which jointly learns the dictionary and linear classifier, is cast as a difference of convex (DC) program and solved efficiently with an iterative DC solver. We conduct experiments on several datasets, and show that our learning algorithm that leverages the structure of the classification problem outperforms generic learning procedures. Our simple classifier based on soft-thresholding also competes with the recent sparse coding classifiers, when the dictionary is learned appropriately. The adopted classification scheme further requires less computational time at the testing stage, compared to other classifiers. The proposed scheme shows the potential of the adequately trained soft-thresholding mapping for classification and paves the way towards the development of very efficient classification methods for vision problems.

algorithm, artificial intelligence, machine learning, (17 more...)

arXiv.org Machine Learning

1402.1973

Country:

Europe (0.46)
North America (0.28)

Genre: Research Report (1.00)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.69)

Add feedback

Taking into Account the Differences between Actively and Passively Acquired Data: The Case of Active Learning with Support Vector Machines for Imbalanced Datasets

Bloodgood, Michael, Vijay-Shanker, K.

arXiv.org Machine LearningSep-16-2014

Actively sampled data can have very different characteristics than passively sampled data. Therefore, it's promising to investigate using different inference procedures during AL than are used during passive learning (PL). This general idea is explored in detail for the focused case of AL with cost-weighted SVMs for imbalanced data, a situation that arises for many HLT tasks. The key idea behind the proposed InitPA method for addressing imbalance is to base cost models during AL on an estimate of overall corpus imbalance computed via a small unbiased sample rather than the imbalance in the labeled training data, which is the leading method used during PL.

artificial intelligence, imbalance, machine learning, (14 more...)

arXiv.org Machine Learning

1409.4835

Country: North America > United States > Delaware > New Castle County > Newark (0.14)

Genre: Research Report (0.65)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)

Add feedback

An Approach to Reducing Annotation Costs for BioNLP

Bloodgood, Michael, Vijay-Shanker, K.

arXiv.org Machine LearningSep-12-2014

There is a broad range of BioNLP tasks for which active learning (AL) can significantly reduce annotation costs and a specific AL algorithm we have developed is particularly effective in reducing annotation costs for these tasks. We have previously developed an AL algorithm called ClosestInitPA that works best with tasks that have the following characteristics: redundancy in training material, burdensome annotation costs, Support Vector Machines (SVMs) work well for the task, and imbalanced datasets (i.e. when set up as a binary classification problem, one class is substantially rarer than the other). Many BioNLP tasks have these characteristics and thus our AL algorithm is a natural approach to apply to BioNLP tasks.

artificial intelligence, criterion, machine learning, (15 more...)

arXiv.org Machine Learning

1409.3881

Country: North America > United States > Delaware > New Castle County > Newark (0.17)

Genre: Research Report (0.65)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.73)

Add feedback

Global Convergence of Online Limited Memory BFGS

Mokhtari, Aryan, Ribeiro, Alejandro

arXiv.org Machine LearningSep-6-2014

Global convergence of an online (stochastic) limited memory version of the Broyden-Fletcher- Goldfarb-Shanno (BFGS) quasi-Newton method for solving optimization problems with stochastic objectives that arise in large scale machine learning is established. Lower and upper bounds on the Hessian eigenvalues of the sample functions are shown to suffice to guarantee that the curvature approximation matrices have bounded determinants and traces, which, in turn, permits establishing convergence to optimal arguments with probability 1. Numerical experiments on support vector machines with synthetic data showcase reductions in convergence time relative to stochastic gradient descent algorithms as well as reductions in storage and computation relative to other online quasi-Newton methods. Experimental evaluation on a search engine advertising problem corroborates that these advantages also manifest in practical applications.

artificial intelligence, feature vector, machine learning, (15 more...)

arXiv.org Machine Learning

1409.2045

Country:

North America > United States > Pennsylvania (0.27)
North America > United States > Michigan (0.27)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.56)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.54)

Add feedback