 Support Vector Machines


Large dimensional analysis of general margin based classification methods

arXiv.org Machine Learning

Margin-based classifiers have been popular in both machine learning and statistics for classification problems. Since a large number of classifiers are available, one natural question is which type of classifier should be used for a particular classification task. We aim to answer this question by investigating the asymptotic performance of a family of large-margin classifiers in situations where the data dimension $p$ and the sample size $n$ are both large. This family covers a broad range of classifiers, including the support vector machine, distance weighted discrimination, penalized logistic regression, and the large-margin unified machine as special cases. The asymptotic results are described by a set of nonlinear equations, and we observe a close match between them and Monte Carlo simulations on finite data samples. Our analytical studies shed new light on how to select the best classifier among various classification methods as well as on how to choose the optimal tuning parameters for a given method.
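For readers unfamiliar with the family, the classifiers named above differ mainly in the loss applied to the functional margin $u = y f(x)$. The sketch below writes out three of these losses in their common textbook forms. The function names are illustrative, the DWD loss is shown in its generalized form with exponent $q = 1$, and none of this is taken from the paper's own parameterization.

```python
import math

def hinge_loss(u):
    """SVM hinge loss: zero once the margin u reaches 1."""
    return max(0.0, 1.0 - u)

def logistic_loss(u):
    """Logistic regression loss (deviance form): log(1 + exp(-u))."""
    return math.log1p(math.exp(-u))

def dwd_loss(u):
    """Distance weighted discrimination loss, generalized form with q = 1
    (an assumed parameterization): linear for u <= 1/2, then 1 / (4u)."""
    return 1.0 - u if u <= 0.5 else 1.0 / (4.0 * u)

# Compare the three losses on a few margins.
for u in (-1.0, 0.0, 0.5, 1.0, 2.0):
    print(u, hinge_loss(u), logistic_loss(u), dwd_loss(u))
```

All three penalize small or negative margins, but only the hinge loss is exactly zero for $u \ge 1$; the other two decay smoothly.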


A Comparative Analysis of Android Malware

arXiv.org Machine Learning

In this paper, we present a comparative analysis of benign and malicious Android applications, based on static features. In particular, we focus our attention on the permissions requested by an application. We consider both the binary classification of malware versus benign applications and the multiclass problem, where we classify malware samples into their respective families. Our experiments are based on substantial malware datasets and we employ a wide variety of machine learning techniques, including decision trees and random forests, support vector machines, logistic model trees, AdaBoost, and artificial neural networks. We find that permissions are a strong feature and that, by careful feature engineering, we can significantly reduce the number of features needed for highly accurate detection and classification.


Prediction of higher-selectivity catalysts by computer-driven workflow and machine learning

Science

To demonstrate the viability of our method, we predicted reaction outcomes with substrate combinations and catalysts different from the training data and simulated a situation in which highly selective reactions had not been achieved. In the first demonstration, a model was constructed by using support vector machines and validated with three different external test sets. The first test set evaluated the ability of the model to predict the selectivity of only reactions forming new products with catalysts from the training set. The model performed well, with a mean absolute deviation (MAD) of 0.161 kcal/mol. Next, the same model was used to predict the selectivity of an external test set of catalysts with substrate combinations from the training set.


Prediction of remaining service life of pavement using an optimized support vector machine (case study of Semnan–Firuzkuh road)

#artificialintelligence

Estimation of the prerequisites for the maintenance, repair, rehabilitation and reconstruction of pavement is one of the requirements for the design and maintenance of the structure of pavement. The pavement design methods are based on providing a proper prediction of the structure of pavement to keep it in permissible condition. The term 'remaining service life' (RSL) refers to the time it takes for the pavement to reach an unacceptable status and need to be rehabilitated or reconstructed (Elkins, Thompson, Groerger, Visintine, & Rada, 2013). Prediction of the RSL is a basic concept of pavement maintenance planning. Awareness of the future conditions of pavement is a key point in making decisions in the planning of pavement maintenance. On the other hand, we know that pavement optimization methods are urgently needed to predict changes in pavement conditions over a defined period of time.


A GA-based feature selection of the EEG signals by classification evaluation: Application in BCI systems

arXiv.org Machine Learning

In electroencephalogram (EEG) signal processing, finding the appropriate information in a dataset has been a big challenge for successful signal classification. Feature selection methods make it possible to address this problem; however, which method best extracts the most suitable features of the signal and improves classification performance is still under investigation. In this study, we use the genetic algorithm (GA), a heuristic search algorithm, to find the optimum combination of feature extraction methods and classifiers in brain-computer interface (BCI) applications. A BCI system can be practical if and only if it performs with both high accuracy and high speed. In the proposed method, the GA acts as a search engine to find the best combination of features and classifiers. The features used here are the Katz, Higuchi, Petrosian, Sevcik, and box-counting dimension (BCD) feature extraction methods. These features are applied to the wavelet subbands and are classified with four classifiers: adaptive neuro-fuzzy inference system (ANFIS), fuzzy k-nearest neighbors (FKNN), support vector machine (SVM) and linear discriminant analysis (LDA). Due to the huge number of features, GA optimization is used to find the features with the optimum fitness value (FV). Results reveal that the Katz fractal feature estimation method with LDA classification has the best FV. Consequently, due to the low computation time of the first Daubechies wavelet transformation in comparison to the original signal, the final selected methods contain the fractal features of the first coefficient of the detail subbands.
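As an illustration of one of the fractal features listed above, here is a minimal sketch of the Katz fractal dimension for a one-dimensional signal. It assumes the common simplification in which distances are taken along the amplitude axis only; the abstract does not say which variant the authors use, and the function name `katz_fd` is hypothetical.

```python
import math

def katz_fd(signal):
    """Katz fractal dimension of a 1-D sequence (amplitude-only variant).

    FD = log10(n) / (log10(n) + log10(d / L)), where
      L is the total curve length (sum of successive absolute differences),
      d is the largest distance from the first sample to any other sample,
      n is the number of steps (len(signal) - 1).
    """
    if len(signal) < 2:
        raise ValueError("need at least two samples")
    steps = [abs(b - a) for a, b in zip(signal, signal[1:])]
    L = sum(steps)
    d = max(abs(x - signal[0]) for x in signal[1:])
    n = len(steps)
    if L == 0 or d == 0:
        raise ValueError("signal is flat; dimension is undefined")
    denom = math.log10(n) + math.log10(d / L)
    if denom == 0:
        raise ValueError("degenerate case: n * d equals L")
    return math.log10(n) / denom

print(katz_fd([0, 1, 2, 3, 4]))   # a straight ramp
print(katz_fd([0, 1, 0, 2, 0]))   # a more irregular waveform
```

A monotone ramp has dimension 1 under this definition, while waveforms that fold back on themselves score higher, which is what makes the measure usable as a classification feature.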


Enhancing Explainability of Neural Networks through Architecture Constraints

arXiv.org Machine Learning

Prediction accuracy and model explainability are the two most important objectives when developing machine learning algorithms to solve real-world problems. Neural networks are known to possess good prediction performance, but lack sufficient model explainability. In this paper, we propose to enhance the explainability of neural networks through the following architecture constraints: a) sparse additive subnetworks; b) orthogonal projection pursuit; and c) smooth function approximation. This leads to a sparse, orthogonal and smooth explainable neural network (SOSxNN). The multiple parameters in the SOSxNN model are simultaneously estimated by a modified mini-batch gradient descent algorithm based on the backpropagation technique for calculating the derivatives and the Cayley transform for preserving the projection orthogonality. The hyperparameters controlling the sparse and smooth constraints are optimized by grid search. Through simulation studies, we compare the SOSxNN method to several benchmark methods including least absolute shrinkage and selection operator, support vector machine, random forest, and multi-layer perceptron. It is shown that the proposed model keeps the flexibility of pursuing prediction accuracy while attaining improved interpretability, and can therefore be used as a promising surrogate model for complex model approximation. Finally, a real data example from the Lending Club is employed as a showcase of the SOSxNN application.


Optimizing Software Effort Estimation Models Using Firefly Algorithm

arXiv.org Artificial Intelligence

Software development effort estimation is considered a fundamental task in the software development life cycle as well as for managing project cost, time and quality. Therefore, accurate estimation is a substantial factor in project success and in reducing risks. In recent years, software effort estimation has received a considerable amount of attention from researchers and has become a challenge for the software industry. In the last two decades, many researchers and practitioners have proposed statistical and machine learning-based models for software effort estimation. In this work, the Firefly Algorithm is proposed as a metaheuristic optimization method for optimizing the parameters of three COCOMO-based models. These models include the basic COCOMO model and two other models proposed in the literature as extensions of the basic COCOMO model. The developed estimation models are evaluated using different evaluation metrics. Experimental results show high accuracy and significant error minimization of the Firefly Algorithm over other metaheuristic optimization algorithms including Genetic Algorithms and Particle Swarm Optimization.
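To make the optimization target concrete, the sketch below shows the basic COCOMO effort equation, effort = a * KLOC^b in person-months, together with one error measure a metaheuristic could minimize over (a, b). The default coefficients are the classic "organic" values from the original COCOMO model; the choice of MMRE as the metric and the function names are assumptions, and the Firefly Algorithm itself is not implemented here.

```python
def basic_cocomo_effort(kloc, a=2.4, b=1.05):
    """Basic COCOMO effort in person-months for a project of `kloc`
    thousand lines of code. Defaults are the classic organic-mode values."""
    return a * kloc ** b

def mmre(actual, predicted):
    """Mean magnitude of relative error, a common effort-estimation metric
    (assumed here; the abstract does not name the metrics it uses)."""
    return sum(abs(y - p) / y for y, p in zip(actual, predicted)) / len(actual)

def fitness(params, sizes_kloc, actual_efforts):
    """Objective an optimizer would minimize over the pair (a, b)."""
    a, b = params
    predicted = [basic_cocomo_effort(k, a, b) for k in sizes_kloc]
    return mmre(actual_efforts, predicted)

# Effort estimate for a 10 KLOC project with the default coefficients.
print(round(basic_cocomo_effort(10.0), 2))
```

Any population-based optimizer can be plugged in by evaluating `fitness` on candidate (a, b) pairs against historical project data.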


Audio Captcha Recognition Using RastaPLP Features by SVM

arXiv.org Machine Learning

CAPTCHAs are computer-generated tests that humans can pass but current computer systems cannot. They are commonly used in various web services to distinguish humans from computer programs autonomously. In this way, owners can protect their web services from bots. In addition to visual CAPTCHAs, which consist of distorted images, mostly test images, about which a user must write some description, there is a significant number of audio CAPTCHAs as well. Briefly, audio CAPTCHAs are sound files that consist of human speech under heavy noise, in which the speaker pronounces a series of digits consecutively. Generally, these sound files contain periodic and non-periodic noise that makes them difficult for a program to recognize but not for a human listener. We gathered numerous randomly collected audio files to train and then test our SVM algorithm so that it can extract the digits from each conversation.


Analogy-Based Preference Learning with Kernels

arXiv.org Machine Learning

Building on a specific formalization of analogical relationships of the form "A relates to B as C relates to D", we establish a connection between two important subfields of artificial intelligence, namely analogical reasoning and kernel-based machine learning. More specifically, we show that so-called analogical proportions are closely connected to kernel functions on pairs of objects. Based on this result, we introduce the analogy kernel, which can be seen as a measure of how strongly four objects are in analogical relationship. As an application, we consider the problem of object ranking in the realm of preference learning, for which we develop a new method based on support vector machines trained with the analogy kernel. Our first experimental results for data sets from different domains (sports, education, tourism, etc.) are promising and suggest that our approach is competitive with state-of-the-art algorithms in terms of predictive accuracy.


Solving large-scale L1-regularized SVMs and cousins: the surprising effectiveness of column and constraint generation

arXiv.org Machine Learning

The linear Support Vector Machine (SVM) is one of the most popular binary classification techniques in machine learning. Motivated by applications in modern high dimensional statistics, we consider penalized SVM problems involving the minimization of a hinge-loss function with a convex sparsity-inducing regularizer such as the L1-norm on the coefficients, its grouped generalization, and the sorted L1-penalty (aka Slope). Each problem can be expressed as a Linear Program (LP) and is computationally challenging when the number of features and/or samples is large; the current state of algorithms for these problems is rather nascent when compared to the usual L2-regularized linear SVM. To this end, we propose new computational algorithms for these LPs by bringing together techniques from (a) classical column (and constraint) generation methods and (b) first order methods for non-smooth convex optimization, techniques that are rarely used together for solving large scale LPs. These components have their respective strengths; and while they are found to be useful as separate entities, they have not been used together in the context of solving large scale LPs such as the ones studied herein. Our approach complements the strengths of (a) and (b), leading to a scheme that seems to outperform commercial solvers as well as specialized implementations for these problems by orders of magnitude. We present numerical results on a series of real and synthetic datasets demonstrating the surprising effectiveness of classic column/constraint generation methods in the context of challenging LP-based machine learning tasks.
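For orientation, the standard way to write the L1-regularized hinge-loss problem as an LP is sketched below. The notation (samples $(x_i, y_i)$ with $y_i \in \{-1, +1\}$, slack variables $\xi_i$, regularization weight $\lambda$, and the split $\beta = \beta^+ - \beta^-$) is assumed for illustration and is not taken from the paper.

```latex
% L1-regularized linear SVM: hinge loss plus an L1 penalty on the coefficients
\min_{\beta_0,\, \beta} \;\; \sum_{i=1}^{n} \max\bigl(0,\; 1 - y_i (x_i^\top \beta + \beta_0)\bigr) + \lambda \|\beta\|_1

% Equivalent linear program, with slacks \xi_i and \beta = \beta^+ - \beta^-
\begin{aligned}
\min_{\beta_0,\, \beta^+,\, \beta^-,\, \xi} \quad
  & \sum_{i=1}^{n} \xi_i + \lambda \sum_{j=1}^{p} \bigl(\beta_j^+ + \beta_j^-\bigr) \\
\text{s.t.} \quad
  & y_i \bigl(x_i^\top (\beta^+ - \beta^-) + \beta_0\bigr) + \xi_i \ge 1, \quad i = 1, \dots, n, \\
  & \xi_i \ge 0, \quad \beta_j^+ \ge 0, \quad \beta_j^- \ge 0 .
\end{aligned}
```

The LP has one constraint per sample and two columns per feature, which is why generating constraints and columns on demand is a natural fit when $n$ or $p$ is large.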