AITopics

1507.08074

Country: Europe > Russia (0.29)

Genre: Research Report > New Finding (0.66)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.68)

Germain, Pascal, Lacasse, Alexandre, Laviolette, François, Marchand, Mario, Roy, Jean-Francis

Risk Bounds for the Majority Vote: From a PAC-Bayesian Analysis to a Learning Algorithm

arXiv.org Machine Learning, Jul-28-2015

We propose an extensive analysis of the behavior of majority votes in binary classification. In particular, we introduce a risk bound for majority votes, called the C-bound, that takes into account the average quality of the voters and their average disagreement. We also propose an extensive PAC-Bayesian analysis that shows how the C-bound can be estimated from various observations contained in the training data. The analysis intends to be self-contained and can be used as introductory material to PAC-Bayesian statistical learning theory. It starts from a general PAC-Bayesian perspective and ends with uncommon PAC-Bayesian bounds. Some of these bounds contain no Kullback-Leibler divergence and others allow kernel functions to be used as voters (via the sample compression setting). Finally, out of the analysis, we propose the MinCq learning algorithm that basically minimizes the C-bound. MinCq reduces to a simple quadratic program. Aside from being theoretically grounded, MinCq achieves state-of-the-art performance, as shown in our extensive empirical comparison with both AdaBoost and the Support Vector Machine.
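
To make the C-bound concrete, the following is a minimal Python sketch that evaluates it on a toy sample. It assumes the usual form of the bound, C_Q = 1 - (1 - 2 R(G_Q))^2 / (1 - 2 d_Q), where R(G_Q) is the Gibbs risk (the average error of a voter drawn from Q) and d_Q is the expected disagreement between two voters drawn from Q. The function names and the toy data are illustrative and not taken from the paper, and the value computed here is an empirical quantity on the sample, not a PAC-Bayesian guarantee and not the MinCq algorithm.

```python
def empirical_c_bound(votes, labels, weights):
    """Return (Gibbs risk, disagreement, C-bound) measured on a sample.

    votes[i][j] is the {-1, +1} output of voter i on example j,
    labels[j] is the {-1, +1} label, and weights is the distribution Q
    over voters (non-negative, summing to 1).
    """
    n = len(labels)
    gibbs_risk = 0.0
    disagreement = 0.0
    for j in range(n):
        # Q-weighted average vote on example j, a value in [-1, 1].
        avg_vote = sum(w * v[j] for w, v in zip(weights, votes))
        # Probability that a voter drawn from Q errs on example j.
        gibbs_risk += (1.0 - labels[j] * avg_vote) / 2.0
        # Probability that two voters drawn independently from Q disagree.
        disagreement += (1.0 - avg_vote ** 2) / 2.0
    gibbs_risk /= n
    disagreement /= n
    if gibbs_risk >= 0.5:
        raise ValueError("the C-bound requires a Gibbs risk below 1/2")
    c_bound = 1.0 - (1.0 - 2.0 * gibbs_risk) ** 2 / (1.0 - 2.0 * disagreement)
    return gibbs_risk, disagreement, c_bound


def majority_vote_risk(votes, labels, weights):
    """Fraction of examples misclassified by the Q-weighted majority vote."""
    errors = 0
    for j, y in enumerate(labels):
        avg_vote = sum(w * v[j] for w, v in zip(weights, votes))
        if y * avg_vote <= 0:  # ties count as errors
            errors += 1
    return errors / len(labels)


# Toy sample (illustrative): three voters, each wrong on a different example.
labels = [1, 1, -1, -1]
votes = [
    [1, 1, -1, 1],
    [1, -1, -1, -1],
    [-1, 1, -1, -1],
]
weights = [1 / 3, 1 / 3, 1 / 3]

gibbs, disagreement, c_bound = empirical_c_bound(votes, labels, weights)
vote_risk = majority_vote_risk(votes, labels, weights)
print(gibbs, disagreement, c_bound, vote_risk)
```

On this sample every voter errs on a quarter of the examples, yet the voters disagree often enough that the majority vote makes no mistake; the bound reflects this by depending on the disagreement as well as on the average voter quality.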

artificial intelligence, machine learning, majority vote, (17 more...)

1503.08329

Country: North America > United States (0.27)

Genre: Research Report > New Finding (0.92)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

Eberts, Mona, Steinwart, Ingo

Optimal Learning Rates for Localized SVMs

arXiv.org Machine Learning, Jul-23-2015

One of the limiting factors of using support vector machines (SVMs) in large-scale applications is their super-linear computational requirements in terms of the number of training samples. To address this issue, several approaches that train SVMs on many small chunks of large data sets separately have been proposed in the literature. So far, however, almost all these approaches have only been empirically investigated. In addition, their motivation was always based on computational requirements. In this work, we consider a localized SVM approach based upon a partition of the input space. For this local SVM, we derive a general oracle inequality. Then we apply this oracle inequality to least squares regression using Gaussian kernels and deduce local learning rates that are essentially minimax optimal under some standard smoothness assumptions on the regression function. This gives the first motivation for using local SVMs that is not based on computational requirements but on theoretical predictions on the generalization performance. We further introduce a data-dependent parameter selection method for our local SVM approach and show that this method achieves the same learning rates as before. Finally, we present some larger scale experiments for our localized SVM showing that it achieves essentially the same test performance as a global SVM for a fraction of the computational requirements. In addition, it turns out that the computational requirements for the local SVMs are similar to those of a vanilla random chunk approach, while the achieved test errors are significantly better.
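
The partition-based idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: the input space is the real line, the partition is a grid of equal-width intervals, and each cell gets its own regularized least-squares fit with a Gaussian kernel (solving (K + n·lam·I) alpha = y on the samples of that cell only). The names `fit_local` and `predict_local`, the grid partition, and the parameter values are all hypothetical.

```python
import math


def gaussian_kernel(x, z, gamma):
    """Gaussian kernel on scalars with width gamma."""
    return math.exp(-((x - z) ** 2) / gamma ** 2)


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_local(xs, ys, width, gamma, lam):
    """Fit one regularized least-squares model per cell of the partition."""
    cells = {}
    for x, y in zip(xs, ys):
        cell = cells.setdefault(math.floor(x / width), ([], []))
        cell[0].append(x)
        cell[1].append(y)
    models = {}
    for index, (cx, cy) in cells.items():
        n = len(cx)
        # Solve (K + n * lam * I) alpha = y using only this cell's samples.
        k = [[gaussian_kernel(u, v, gamma) + (n * lam if i == j else 0.0)
              for j, v in enumerate(cx)] for i, u in enumerate(cx)]
        models[index] = (cx, solve(k, cy))
    return models


def predict_local(models, x, width, gamma):
    """Evaluate the model of the cell that contains x."""
    cx, alpha = models[math.floor(x / width)]
    return sum(a * gaussian_kernel(x, u, gamma) for a, u in zip(alpha, cx))


# Toy regression problem (illustrative): 40 noiseless samples of sin on [0, 4).
xs = [i / 10 for i in range(40)]
ys = [math.sin(x) for x in xs]
models = fit_local(xs, ys, width=1.0, gamma=0.5, lam=1e-5)
estimate = predict_local(models, 2.35, width=1.0, gamma=0.5)
print(len(models), estimate, math.sin(2.35))
```

Each linear system involves only the samples of one cell, so the cost grows with the cell sizes rather than with the full sample size, which is the computational motivation mentioned in the abstract. A query that falls in a cell without training data raises a `KeyError` in this sketch.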

artificial intelligence, machine learning, rc-svm, (16 more...)

1507.06615

Country: North America > United States (1.00)

Genre: Research Report > New Finding (0.45)

Industry: Education (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)

Fawzi, Alhussein, Frossard, Pascal

Manitest: Are classifiers really invariant?

arXiv.org Machine Learning, Jul-23-2015

Invariance to geometric transformations is a highly desirable property of automatic classifiers in many image recognition tasks. Nevertheless, it is unclear to what extent state-of-the-art classifiers are invariant to basic transformations such as rotations and translations. This is mainly due to the lack of general methods that properly measure such an invariance. In this paper, we propose a rigorous and systematic approach for quantifying the invariance to geometric transformations of any classifier. Our key idea is to cast the problem of assessing a classifier's invariance as the computation of geodesics along the manifold of transformed images. We propose the Manitest method, built on the efficient Fast Marching algorithm, to compute the invariance of classifiers. Our new method quantifies in particular the importance of data augmentation for learning invariance from data, and the increased invariance of convolutional neural networks with depth. We foresee that the proposed generic tool for measuring invariance to a large class of geometric transformations and arbitrary classifiers will have many applications for evaluating and comparing classifiers based on their invariance, and will help improve the invariance of existing classifiers.

classifier, machine learning, pattern recognition, (19 more...)

1507.06535

Country:

North America (0.28)
Europe (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Image Matching (0.34)

An Efficient Classifier Based on Hierarchical Mixing Linear Support Vector Machines

Wang, Di (Wenzhou University) | Zhang, Xiaoqin (Wenzhou University) | Fan, Mingyu (Wenzhou University) | Ye, Xiuzi (Wenzhou University)

Support vector machines (SVMs) play a very dominant role in data classification due to their good generalization performance. However, they suffer from the high computational complexity in the classification phase when there are a considerable number of support vectors (SVs). Then it is desirable to design efficient algorithms in the classification phase to deal with the datasets of real-time pattern recognition systems. To this end, we propose a novel classifier called HMLSVMs (Hierarchical Mixing Linear Support Vector Machines) in this paper, which has a hierarchical structure with a mixing linear SVMs classifier at each node and predicts the label of a sample using only a few hyperplanes. We also give a generalization error bound for the class of locally linear SVMs (LLSVMs) based on the Rademacher theory, which ensures that overfitting can be effectively avoided.

[...] SVM in advance, and this limits their applications to large-scale problems. To address this issue, several methods for selecting a set of basis vectors are proposed. They include sampling from the training set in the Nystrom method [Williams and Seeger, 2001] and variants of the Incomplete Cholesky factorization [Bach and Jordan, 2005], core vector machine (CVM) [Tsang et al., 2005], relevance vector machine (RVM) [Tipping, 2001], and relevance units machine (RUM) [Gao and Zhang, 2009]. Wu et al. [Wu et al., 2006] add one constraint on the number of basis vectors to the standard SVM optimization problem, and then solve this modified nonconvex problem to build sparse kernel learning algorithms (SKLA). Joachims and Yu [Joachims and Yu, 2009] explore new sparse kernel SVMs via cutting plane training, called cutting-plane subspace pursuit (CPSP). Although the above methods prune the SVs and reduce computational complexity in the classification phase, when a new test sample is introduced, they still need to compare it with these pruned SVs via kernel calculations to predict the label of the test [...]

classifier, complexity, hmlsvm, (14 more...)

Twenty-Fourth International Joint Conference on Artificial Intelligence

Country:

Asia > Middle East > Jordan (0.24)
Asia > China > Zhejiang Province (0.14)
South America > Paraguay > Asunción > Asunción (0.04)
(4 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)

Villela, Saulo Moraes (Federal University of Juiz de Fora) | Leite, Saul de Castro (Federal University of Juiz de Fora) | Neto, Raul Fonseca (Federal University of Juiz de Fora)

Feature Selection from Microarray Data via an Ordered Search with Projected Margin

Microarray experiments are capable of measuring the expression level of thousands of genes simultaneously. Dealing with this enormous amount of information requires complex computation. Support Vector Machines (SVM) have been widely used with great efficiency to solve classification problems that have high dimension. In this sense, it is plausible to develop new feature selection strategies for microarray data that are associated with this type of classifier. Therefore, we propose, in this paper, a new method for feature selection based on an ordered search process to explore the space of possible subsets. The algorithm, called Admissible Ordered Search (AOS), uses as its evaluation function the margin values estimated for each hypothesis by an SVM classifier. An important theoretical contribution of this paper is the development of the projected margin concept. This value is computed as the margin vector projection on a lower dimensional subspace and is used as an upper bound for the current value of the hypothesis in the search process. This yields substantial savings in runtime and, consequently, a more efficient search process as a whole. The algorithm was tested using five different microarray data sets, yielding superior results when compared to three representative feature selection methods.

algorithm, hypothesis, subset, (15 more...)

Twenty-Fourth International Joint Conference on Artificial Intelligence

Country:

South America > Brazil > Minas Gerais (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > New Jersey > Middlesex County > New Brunswick (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Oncology (0.46)

Technology:

Information Technology > Biomedical Informatics > Translational Bioinformatics (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.89)

Optimizing Locally Linear Classifiers with Supervised Anchor Point Learning

Mao, Xue (Chinese Academy of Sciences) | Fu, Zhouyu (University of Western Sydney) | Wu, Ou (Chinese Academy of Sciences) | Hu, Weiming (Chinese Academy of Sciences)

Kernel SVM suffers from high computational complexity when dealing with large-scale nonlinear datasets. To address this issue, locally linear classifiers have been proposed for approximating nonlinear decision boundaries with locally linear functions using a local coding scheme. The effectiveness of such a coding scheme depends heavily on the quality of the anchor points chosen to produce the local codes. Existing methods usually involve a phase of unsupervised anchor point learning followed by supervised classifier learning. Thus, the anchor points and classifiers are obtained separately, and the learned anchor points may not be optimal for the discriminative task. In this paper, we present a novel fully supervised approach for anchor point learning. A single optimization problem is formulated over both anchor point and classifier variables, optimizing the initial anchor points jointly with the classifiers to minimize the classification risk. Experimental results show that our method outperforms other competitive methods which employ unsupervised anchor point learning and achieves performance on par with the kernel SVM, albeit with much improved efficiency.
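
As a rough illustration of how a locally linear classifier turns local codes into a nonlinear decision, the Python sketch below evaluates f(x) = sum_j c_j(x) (w_j · x + b_j), where c_j(x) is the local code of x with respect to anchor point j. The coding scheme used here, a normalized Gaussian soft assignment, is an assumption chosen for brevity and is not necessarily the one used in the paper. The anchors and linear models are set by hand, so the sketch covers prediction only and does not reproduce the joint, supervised optimization of anchors and classifiers that the paper proposes. All names are hypothetical.

```python
import math


def local_codes(x, anchors, beta):
    """Normalized Gaussian soft assignment of x to each anchor point."""
    raw = [math.exp(-beta * sum((a - b) ** 2 for a, b in zip(x, anchor)))
           for anchor in anchors]
    total = sum(raw)
    return [r / total for r in raw]


def decision_value(x, anchors, models, beta):
    """Code-weighted sum of the linear models attached to the anchors."""
    codes = local_codes(x, anchors, beta)
    value = 0.0
    for code, (w, b) in zip(codes, models):
        value += code * (sum(wi * xi for wi, xi in zip(w, x)) + b)
    return value


def predict(x, anchors, models, beta=1.0):
    """Return +1 or -1 according to the sign of the decision value."""
    return 1 if decision_value(x, anchors, models, beta) > 0 else -1


# Hand-set toy model on the real line: the positive class is roughly |x| > 1,
# which no single linear classifier can represent.
anchors = [[-1.0], [1.0]]
models = [([-1.0], -1.0), ([1.0], -1.0)]  # (weight vector, bias) per anchor

predictions = [predict([x], anchors, models) for x in (-2.0, 0.0, 2.0)]
print(predictions)
```

Prediction costs one linear model evaluation per anchor rather than one kernel evaluation per support vector, which is where the efficiency gain over a kernel SVM comes from. Because the codes depend on the anchor positions, poorly placed anchors weight the wrong linear models, which is why the quality of the anchors matters.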

anchor point, classifier, dataset, (17 more...)

Twenty-Fourth International Joint Conference on Artificial Intelligence

Country:

Oceania > Australia > New South Wales > Sydney (0.04)
Asia > China (0.04)

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.72)

Li, Xiang (University of Western Ontario and National University of Defense Technology) | Wang, Huaimin (National University of Defense Technology) | Gu, Bin (Nanjing University of Information Science & Technology and University of Western Ontario) | Ling, Charles X. (University of Western Ontario)

Data Sparseness in Linear SVM

Large sparse datasets are common in many real-world applications. Linear SVM has been shown to be very efficient for classifying such datasets. However, it is still unknown how data sparseness affects its convergence behavior. To study this problem in a systematic manner, we propose a novel approach to generate large and sparse data from real-world datasets, using statistical inference and the data sampling process in the PAC framework. We first study the convergence behavior of linear SVM experimentally and make several observations that are useful for real-world applications. We then offer theoretical proofs for our observations by studying the Bayes risk and PAC bound. Our experimental and theoretical results are valuable for learning large sparse datasets with linear SVM.

data sparseness, dataset, sparseness, (15 more...)

Twenty-Fourth International Joint Conference on Artificial Intelligence

Country:

North America > Canada > Ontario (0.04)
Asia > China > Jiangsu Province > Nanjing (0.04)

Genre: Research Report > New Finding (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)

Pre-release Prediction of Crowd Opinion on Movies by Label Distribution Learning

Geng, Xin (Southeast University) | Hou, Peng (Southeast University)

This paper studies an interesting problem: is it possible to predict the crowd opinion about a movie before the movie is actually released? The crowd opinion is here expressed by the distribution of ratings given by a sufficient number of people. Consequently, the pre-release crowd opinion prediction can be regarded as a Label Distribution Learning (LDL) problem. In order to solve this problem, a Label Distribution Support Vector Regressor (LDSVR) is proposed in this paper. The basic idea of LDSVR is to fit a sigmoid function to each component of the label distribution simultaneously by a multi-output support vector machine. Experimental results show that LDSVR can accurately predict people's rating distribution about a movie just based on the pre-release metadata of the movie.

insensitive zone, ldsvr, movie, (12 more...)

Twenty-Fourth International Joint Conference on Artificial Intelligence

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Ohio > Franklin County > Columbus (0.04)
North America > United States > Colorado > Denver County > Denver (0.04)
(9 more...)

Genre:

Research Report (0.68)
Overview (0.66)

Industry:

Media > Film (1.00)
Leisure & Entertainment (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.91)

Modeling Inter- and Intra-Part Deformations for Object Structure Parsing

Cai, Ling (Xiamen University) | Ji, Rongrong (Xiamen University) | Liu, Wei (IBM T. J. Watson Research Center) | Hua, Gang (Stevens Institute of Technology)

Part deformation has been a longstanding challenge for object parsing, of which the primary difficulty lies in modeling the highly diverse object structures. To this end, we propose a novel structure parsing model to capture deformable object structures. The proposed model consists of two deformable layers: the top layer is an undirected graph that incorporates inter-part deformations to infer object structures; the base layer consists of various independent nodes to characterize local intra-part deformations. To learn this two-layer model, we design a layer-wise learning algorithm, which employs matching pursuit and belief propagation for a low computational complexity inference. Specifically, active basis sparse coding is leveraged to build the nodes at the base layer, while the edge weights are estimated by a structural support vector machine. Experimental results on two benchmark datasets (i.e., faces and horses) demonstrate that the proposed model yields superior parsing performance over state-of-the-art models.

deformation, inter-part deformation, node, (17 more...)

Twenty-Fourth International Joint Conference on Artificial Intelligence

Country:

North America > United States (0.14)
Asia > China > Fujian Province > Xiamen (0.04)

Genre: Research Report (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.54)