Goto

Collaborating Authors

 Support Vector Machines


Is Deep Learning Necessary For Simple Classification Tasks

#artificialintelligence

Deep learning (DL) models are known for tackling the nonlinearities associated with data, which the traditional estimators such as logistic regression couldn't. However, there is still a cloud of doubt with regards to the increased use of computationally intensive DL for simple classification tasks. To find out if DL really outperforms shallow models significantly, the researchers from the University of Pennsylvania experiment with three ML pipelines that involve traditional methods, AutoML and DL in a paper titled, 'Is Deep Learning Necessary For Simple Classification Tasks.' The UPenn researchers stated that a support-vector machine (SVM) model might predict more accurately susceptibility to a certain complex genetic disease than a gradient boosting model trained on the same dataset. Moreover, choosing different hyperparameters within that SVM model can vary performances.


Artificial Intelligence Masterclass

#artificialintelligence

Online Courses Udemy Enter the new era of Hybrid AI Models optimized by Deep NeuroEvolution, with a complete toolkit of ML, DL & AI models Created by Hadelin de Ponteves, Kirill Eremenko, SuperDataScience Team English, Italian [Auto-generated] Students also bought Artificial Intelligence: Reinforcement Learning in Python Machine Learning and AI: Support Vector Machines in Python Advanced AI: Deep Reinforcement Learning in Python Ensemble Machine Learning in Python: Random Forest, AdaBoost Deep Learning: Advanced Computer Vision (GANs, SSD, More!) Preview this course GET COUPON CODE Description Today, we are bringing you the king of our AI courses...: The Artificial Intelligence MASTERCLASS Are you keen on Artificial Intelligence? Do want to learn to build the most powerful AI model developed so far and even play against it? Sounds tempting right... Then Artificial Intelligence Masterclass course is the right choice for you. This ultimate AI toolbox is all you need to nail it down with ease. You will get 10 hours step by step guide and the full roadmap which will help you build your own Hybrid AI Model from scratch.


Modeling and Uncertainty Analysis of Groundwater Level Using Six Evolutionary Optimization Algorithms Hybridized with ANFIS, SVM, and ANN

arXiv.org Machine Learning

In the present study, six meta-heuristic schemes are hybridized with artificial neural network (ANN), adaptive neuro-fuzzy interface system (ANFIS), and support vector machine (SVM), to predict monthly groundwater level (GWL), evaluate uncertainty analysis of predictions and spatial variation analysis. The six schemes, including grasshopper optimization algorithm (GOA), cat swarm optimization (CSO), weed algorithm (WA), genetic algorithm (GA), krill algorithm (KA), and particle swarm optimization (PSO), were used to hybridize for improving the performance of ANN, SVM, and ANFIS models. Groundwater level (GWL) data of Ardebil plain (Iran) for a period of 144 months were selected to evaluate the hybrid models. The pre-processing technique of principal component analysis (PCA) was applied to reduce input combinations from monthly time series up to 12-month prediction intervals. The results showed that the ANFIS-GOA was superior to the other hybrid models for predicting GWL in the first piezometer and third piezometer in the testing stage. The performance of hybrid models with optimization algorithms was far better than that of classical ANN, ANFIS, and SVM models without hybridization. The percent of improvements in the ANFIS-GOA versus standalone ANFIS in piezometer 10 were 14.4%, 3%, 17.8%, and 181% for RMSE, MAE, NSE, and PBIAS in the training stage and 40.7%, 55%, 25%, and 132% in testing stage, respectively. The improvements for piezometer 6 in train step were 15%, 4%, 13%, and 208% and in the test step were 33%, 44.6%, 16.3%, and 173%, respectively, that clearly confirm the superiority of developed hybridization schemes in GWL modeling. Uncertainty analysis showed that ANFIS-GOA and SVM had, respectively, the best and worst performances among other models. In general, GOA enhanced the accuracy of the ANFIS, ANN, and SVM models.


K-Nearest Neighbour and Support Vector Machine Hybrid Classification

arXiv.org Machine Learning

In this paper, a novel K-Nearest Neighbour and Support Vector Machine hybrid classification technique has been proposed that is simple and robust. It is based on the concept of discriminative nearest neighbourhood classification. The technique consists of using K-Nearest Neighbour Classification for test samples satisfying a proximity condition. The patterns which do not pass the proximity condition are separated. This is followed by sifting the training set for a fixed number of patterns for every class which are closest to each separated test pattern respectively, based on the Euclidean distance metric. Subsequently, for every separated test sample, a Support Vector Machine is trained on the sifted training set patterns associated with it, and classification for the test sample is done. The proposed technique has been compared to the state of art in this research area. Three datasets viz. the United States Postal Service (USPS) Handwritten Digit Dataset, MNIST Dataset, and an Arabic numeral dataset, the Modified Arabic Digits Database, MADB, have been used to evaluate the performance of the algorithm. The algorithm generally outperforms the other algorithms with which it has been compared.


Green Machine Learning via Augmented Gaussian Processes and Multi-Information Source Optimization

arXiv.org Machine Learning

Searching for accurate Machine and Deep Learning models is a computationally expensive and awfully energivorous process. A strategy which has been gaining recently importance to drastically reduce computational time and energy consumed is to exploit the availability of different information sources, with different computational costs and different "fidelity", typically smaller portions of a large dataset. The multi-source optimization strategy fits into the scheme of Gaussian Process based Bayesian Optimization. An Augmented Gaussian Process method exploiting multiple information sources (namely, AGP-MISO) is proposed. The Augmented Gaussian Process is trained using only "reliable" information among available sources. A novel acquisition function is defined according to the Augmented Gaussian Process. Computational results are reported related to the optimization of the hyperparameters of a Support Vector Machine (SVM) classifier using two sources: a large dataset - the most expensive one - and a smaller portion of it. A comparison with a traditional Bayesian Optimization approach to optimize the hyperparameters of the SVM classifier on the large dataset only is reported.


Classification Under Human Assistance

arXiv.org Machine Learning

Most supervised learning models are trained for full automation. However, their predictions are sometimes worse than those by human experts on some specific instances. Motivated by this empirical observation, our goal is to design classifiers that are optimized to operate under different automation levels. More specifically, we focus on convex margin-based classifiers and first show that the problem is NP-hard. Then, we further show that, for support vector machines, the corresponding objective function can be expressed as the difference of two functions f = g - c, where g is monotone, non-negative and {\gamma}-weakly submodular, and c is non-negative and modular. This representation allows a recently introduced deterministic greedy algorithm, as well as a more efficient randomized variant of the algorithm, to enjoy approximation guarantees at solving the problem. Experiments on synthetic and real-world data from several applications in medical diagnosis illustrate our theoretical findings and demonstrate that, under human assistance, supervised learning models trained to operate under different automation levels can outperform those trained for full automation as well as humans operating alone.


Performance Evaluation of t-SNE and MDS Dimensionality Reduction Techniques with KNN, ENN and SVM Classifiers

arXiv.org Machine Learning

The central goal of this paper is to establish two commonly available dimensionality reduction (DR) methods i.e. t-distributed Stochastic Neighbor Embedding (t-SNE) and Multidimensional Scaling (MDS) in Matlab and to observe their application in several datasets. These DR techniques are applied to nine different datasets namely CNAE9, Segmentation, Seeds, Pima Indians diabetes, Parkinsons, Movement Libras, Mammographic Masses, Knowledge, and Ionosphere acquired from UCI machine learning repository. By applying t-SNE and MDS algorithms, each dataset is transformed to the half of its original dimension by eliminating unnecessary features from the datasets. Subsequently, these datasets with reduced dimensions are fed into three supervised classification algorithms for classification. These classification algorithms are K Nearest Neighbors (KNN), Extended Nearest Neighbors (ENN), and Support Vector Machine (SVM). Again, all these algorithms are implemented in Matlab. The training and test data ratios are maintained as ninety percent: ten percent for each dataset. Upon accuracy observation, the efficiency for every dimensionality technique with availed classification algorithms is analyzed and the performance of each classifier is evaluated.


Towards Threshold Invariant Fair Classification

arXiv.org Machine Learning

Effective machine learning models can automatically learn useful information from a large quantity of data and provide decisions in a high accuracy. These models may, however, lead to unfair predictions in certain sense among the population groups of interest, where the grouping is based on such sensitive attributes as race and gender. Various fairness definitions, such as demographic parity and equalized odds, were proposed in prior art to ensure that decisions guided by the machine learning models are equitable. Unfortunately, the "fair" model trained with these fairness definitions is threshold sensitive, i.e., the condition of fairness may no longer hold true when tuning the decision threshold. This paper introduces the notion of threshold invariant fairness, which enforces equitable performances across different groups independent of the decision threshold. To achieve this goal, this paper proposes to equalize the risk distributions among the groups via two approximation methods. Experimental results demonstrate that the proposed methodology is effective to alleviate the threshold sensitivity in machine learning models designed to achieve fairness.


Artificial Musical Intelligence: A Survey

arXiv.org Artificial Intelligence

Computers have been used to analyze and create music since they were first introduced in the 1950s and 1960s. Beginning in the late 1990s, the rise of the Internet and large scale platforms for music recommendation and retrieval have made music an increasingly prevalent domain of machine learning and artificial intelligence research. While still nascent, several different approaches have been employed to tackle what may broadly be referred to as "musical intelligence." This article provides a definition of musical intelligence, introduces a taxonomy of its constituent components, and surveys the wide range of AI methods that can be, and have been, brought to bear in its pursuit, with a particular emphasis on machine learning methods.


Robust Persistence Diagrams using Reproducing Kernels

arXiv.org Machine Learning

Persistent homology has become an important tool for extracting geometric and topological features from data, whose multi-scale features are summarized in a persistence diagram. From a statistical perspective, however, persistence diagrams are very sensitive to perturbations in the input space. In this work, we develop a framework for constructing robust persistence diagrams from superlevel filtrations of robust density estimators constructed using reproducing kernels. Using an analogue of the influence function on the space of persistence diagrams, we establish the proposed framework to be less sensitive to outliers. The robust persistence diagrams are shown to be consistent estimators in bottleneck distance, with the convergence rate controlled by the smoothness of the kernel. This, in turn, allows us to construct uniform confidence bands in the space of persistence diagrams. Finally, we demonstrate the superiority of the proposed approach on benchmark datasets.