 Support Vector Machines


Implicit Bias of Gradient Descent for Wide Two-layer Neural Networks Trained with the Logistic Loss

arXiv.org Machine Learning

Neural networks trained to minimize the logistic (a.k.a. cross-entropy) loss with gradient-based methods are observed to perform well in many supervised classification tasks. Towards understanding this phenomenon, we analyze the training and generalization behavior of infinitely wide two-layer neural networks with homogeneous activations. We show that the limits of the gradient flow on exponentially tailed losses can be fully characterized as a max-margin classifier in a certain non-Hilbertian space of functions. In the presence of hidden low-dimensional structures, the resulting margin is independent of the ambient dimension, which leads to strong generalization bounds. In contrast, training only the output layer implicitly solves a kernel support vector machine, which a priori does not enjoy such adaptivity. Our analysis of training is non-quantitative in terms of running time, but we prove computational guarantees in simplified settings by showing equivalences with online mirror descent. Finally, numerical experiments suggest that our analysis describes well the practical behavior of two-layer neural networks with ReLU activation and confirm the statistical benefits of this implicit bias.


1D CNN Based Network Intrusion Detection with Normalization on Imbalanced Data

arXiv.org Artificial Intelligence

Intrusion detection systems (IDSs) play an essential role in computer networks, protecting computing resources and data from outside attacks. Recent IDSs face challenges in improving flexibility and efficiency against unexpected and unpredictable attacks. Deep neural networks (DNNs) are widely used as a machine learning technique for abstracting features in complex systems. In this paper, we propose a deep learning approach for developing an efficient and flexible IDS using a one-dimensional Convolutional Neural Network (1D-CNN). Two-dimensional CNN methods have shown remarkable performance in detecting objects in images in the computer vision area. Meanwhile, the 1D-CNN can be used for supervised learning on time-series data. We establish a machine learning model based on the 1D-CNN by serializing Transmission Control Protocol/Internet Protocol (TCP/IP) packets in a predetermined time range as an invasion Internet traffic model for the IDS, where normal and abnormal network traffic is categorized and labeled for supervised learning in the 1D-CNN. We evaluated our model on the UNSW_NB15 IDS dataset to show the effectiveness of our method. For performance comparison, machine learning-based Random Forest (RF) and Support Vector Machine (SVM) models are used in addition to the 1D-CNN with various network parameters and architectures. In each experiment, the models are run for up to 200 epochs with a learning rate of 0.0001 on imbalanced and balanced data. The 1D-CNN and its variant architectures outperformed the classical machine learning classifiers. This is mainly because a CNN can extract high-level feature representations that capture the abstract form of low-level feature sets of network traffic connections.


Novel Meta-Heuristic Model for Discrimination between Iron Deficiency Anemia and B-Thalassemia with CBC Indices Based on Dynamic Harmony Search

arXiv.org Machine Learning

In recent decades, attention has been directed at anemia classification for various medical purposes, such as thalassemia screening and predicting iron deficiency anemia (IDA). In this study, a new method has been successfully tested for discrimination between IDA and β-thalassemia trait (β-TT). The method is based on a Dynamic Harmony Search (DHS). Complete blood count (CBC), a fast and inexpensive laboratory test, is used as the input of the system. Other models, such as a genetic programming method called structured representation on genetic algorithm in non-linear function fitting (STROGANOFF), an artificial neural network (ANN), an adaptive neuro-fuzzy inference system (ANFIS), a support vector machine (SVM), k-nearest neighbor (KNN), and certain traditional methods, are compared with the proposed method.


Tropical Support Vector Machine and its Applications to Phylogenomics

arXiv.org Machine Learning

Most data in genome-wide phylogenetic analysis (phylogenomics) is essentially multidimensional, posing a major challenge to human comprehension and computational analysis. Also, we cannot directly apply statistical learning models in data science to a set of phylogenetic trees since the space of phylogenetic trees is not Euclidean. In fact, the space of phylogenetic trees is a tropical Grassmannian in terms of max-plus algebra. Therefore, to classify multi-locus data sets for phylogenetic analysis, we propose tropical Support Vector Machines (SVMs) over the space of phylogenetic trees. Like classical SVMs, a tropical SVM is a discriminative classifier defined by the tropical hyperplane which maximizes the minimum tropical distance from data points to itself in order to separate these data points into open sectors. We show that we can formulate hard margin tropical SVMs and soft margin tropical SVMs as linear programming problems. In addition, we show the necessary and sufficient conditions for each data point to be separated and an explicit formula for the optimal solution for the feasible linear programming problem. Based on our theorems, we develop novel methods to compute tropical SVMs and computational experiments show our methods work well. We end this paper with open problems.
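The tropical distance and the open sectors mentioned in this abstract can be illustrated with a short sketch. It assumes the max-plus convention, the usual tropical metric d(x, y) = max_i(x_i - y_i) - min_i(x_i - y_i), and the commonly stated formula that the distance from a point x to the tropical hyperplane with normal vector omega is the gap between the largest and second-largest coordinates of omega + x. The function names and the sample vectors are hypothetical and do not come from the paper.

```python
def tropical_distance(x, y):
    """Tropical metric: max minus min of the coordinate-wise differences."""
    diffs = [a - b for a, b in zip(x, y)]
    return max(diffs) - min(diffs)


def distance_to_tropical_hyperplane(omega, x):
    """Gap between the largest and second-largest entries of omega + x.

    Assumes the max-plus hyperplane: the set of points where the maximum
    of omega_i + x_i is attained at least twice.
    """
    shifted = sorted((w + xi for w, xi in zip(omega, x)), reverse=True)
    return shifted[0] - shifted[1]


def open_sector(omega, x):
    """Index of the coordinate where omega + x is largest (first one on ties)."""
    shifted = [w + xi for w, xi in zip(omega, x)]
    return shifted.index(max(shifted))


# Hypothetical example: a point in R^3 and the hyperplane with omega = 0.
omega = (0, 0, 0)
point = (0, 1, 3)
print(tropical_distance(point, omega))
print(distance_to_tropical_hyperplane(omega, point))
print(open_sector(omega, point))
```

Adding the same constant to every coordinate of a point leaves all three quantities unchanged, which is why these computations are usually carried out in the tropical projective torus.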


A review of machine learning applications in wildfire science and management

arXiv.org Machine Learning

Artificial intelligence has been applied in wildfire science and management since the 1990s, with early applications including neural networks and expert systems. Since then the field has rapidly progressed congruently with the wide adoption of machine learning (ML) in the environmental sciences. Here, we present a scoping review of ML in wildfire science and management. Our objective is to improve awareness of ML among wildfire scientists and managers, as well as illustrate the challenging range of problems in wildfire science available to data scientists. We first present an overview of popular ML approaches used in wildfire science to date, and then review their use in wildfire science within six problem domains: 1) fuels characterization, fire detection, and mapping; 2) fire weather and climate change; 3) fire occurrence, susceptibility, and risk; 4) fire behavior prediction; 5) fire effects; and 6) fire management. We also discuss the advantages and limitations of various ML approaches and identify opportunities for future advances in wildfire science and management within a data science context. We identified 298 relevant publications, where the most frequently used ML methods included random forests, MaxEnt, artificial neural networks, decision trees, support vector machines, and genetic algorithms. There exist opportunities to apply more current ML methods (e.g., deep learning and agent-based learning) in wildfire science. However, despite the ability of ML models to learn on their own, expertise in wildfire science is necessary to ensure realistic modelling of fire processes across multiple scales, while the complexity of some ML methods requires sophisticated knowledge for their application. Finally, we stress that the wildfire research and management community plays an active role in providing relevant, high-quality data for use by practitioners of ML methods.


Data Pre-Processing and Evaluating the Performance of Several Data Mining Methods for Predicting Irrigation Water Requirement

arXiv.org Artificial Intelligence

Recent drought and population growth are placing unprecedented demand on the limited water resources available. Irrigated agriculture is one of the major consumers of freshwater. A large amount of water in irrigated agriculture is wasted due to poor water management practices. To improve water management in irrigated areas, models for estimation of future water requirements are needed. Developing a model for forecasting irrigation water demand can improve water management practices and maximise water productivity. Data mining can be used effectively to build such models. In this study, we prepare a dataset containing information on suitable attributes for forecasting irrigation water demand. The data is obtained from three different sources, namely meteorological data, remote sensing images and water delivery statements. In order to make the prepared dataset useful for demand forecasting and pattern extraction, we pre-process the dataset using a novel approach based on a combination of irrigation and data mining knowledge. We then apply and compare the effectiveness of different data mining methods, namely decision tree (DT), artificial neural networks (ANNs), systematically developed forest (SysFor) for multiple trees, support vector machine (SVM), logistic regression, and the traditional Evapotranspiration (ETc) methods, and evaluate the performance of these models in predicting irrigation water demand. Our experimental results indicate the usefulness of data pre-processing and the effectiveness of different classifiers. Among the six methods we used, SysFor produces the best prediction with 97.5% accuracy, followed by a decision tree with 96% and ANN with 95%, closely matching the predictions with actual water usage. Therefore, we recommend using SysFor and DT models for irrigation water demand forecasting.


An End-to-End Graph Convolutional Kernel Support Vector Machine

arXiv.org Machine Learning

A novel kernel-based support vector machine (SVM) for graph classification is proposed. The SVM feature space mapping consists of a sequence of graph convolutional layers, which generates a vector space representation for each vertex, followed by a pooling layer which generates a reproducing kernel Hilbert space (RKHS) representation for the graph. The use of an RKHS offers the ability to implicitly operate in this space using a kernel function without the computational complexity of explicitly mapping into it. The proposed model is trained in a supervised end-to-end manner whereby the convolutional layers, the kernel function and SVM parameters are jointly optimized with respect to a regularized classification loss. This approach is distinct from existing kernel-based graph classification models, which instead use either feature engineering or unsupervised learning to define the kernel function. Experimental results demonstrate that the proposed model outperforms existing deep learning baseline models on a number of datasets.


What Emotions Make One or Five Stars? Understanding Ratings of Online Product Reviews by Sentiment Analysis and XAI

arXiv.org Artificial Intelligence

When people buy products online, they primarily base their decisions on the recommendations of others given in online reviews. The current work analyzed these online reviews by sentiment analysis and used the extracted sentiments as features to predict the product ratings by several machine learning algorithms. These predictions were disentangled by various methods of explainable AI (XAI) to understand whether the model showed any bias during prediction. Study 1 benchmarked these algorithms (kNN, support vector machines, random forests, gradient boosting machines, XGBoost) and identified random forests and XGBoost as the best algorithms for predicting the product ratings. In Study 2, the analysis of global feature importance identified the sentiment joy and the emotional valence negative as the most predictive features. Two XAI visualization methods, local feature attributions and partial dependency plots, revealed several incorrect prediction mechanisms on the instance level. Performing the benchmarking as classification, Study 3 identified a high no-information rate of 64.4% that indicated high class imbalance as the underlying reason for the identified problems. In conclusion, good performance by machine learning algorithms must be taken with caution because the dataset, as encountered in this work, could be biased towards certain predictions. This work demonstrates how XAI methods reveal such prediction bias.
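The no-information rate mentioned in this abstract is, in its usual definition, the accuracy obtained by always predicting the most frequent class. A minimal sketch of that computation follows; the function name and the toy star ratings are hypothetical and are not taken from the study's data.

```python
from collections import Counter


def no_information_rate(labels):
    """Share of the most frequent class: the accuracy of always guessing it."""
    counts = Counter(labels)
    majority_count = max(counts.values())
    return majority_count / len(labels)


# Hypothetical star ratings, heavily skewed towards five stars.
ratings = [5, 5, 5, 1, 5, 4, 5, 5]
print(no_information_rate(ratings))
```

A classifier whose accuracy is close to this value has learned little beyond the class frequencies, which is why a high no-information rate signals class imbalance.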


CODEBUG

#artificialintelligence

Support vector machine (SVM) is a supervised machine learning algorithm that is considered an effective tool for both classification and regression problems. Put simply, an SVM tries to find a hyperplane that separates the members of one class from those of another. If no such hyperplane exists for a given data set, the SVM applies a non-linear mapping to the training data, transforming them to a higher dimension where it searches for the optimal hyperplane. The SVM algorithm uses support vectors and margins to place these hyperplanes in the training data. Because the non-linear mapping lets it capture complex relations in the input data, it can achieve high accuracy compared to other supervised classification algorithms (kNN, NCC, ...). People have been using SVMs for different applications such as text classification, image (handwriting) recognition and more.
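The idea of mapping data to a higher dimension can be made concrete with a toy sketch. It assumes a hypothetical one-dimensional data set and the explicit feature map x -> (x, x^2); the separating hyperplane is chosen by hand rather than produced by an SVM solver, and all names are illustrative.

```python
import math


def separable_by_threshold(points, labels):
    """True if a single threshold on a 1-D feature splits the two classes."""
    pos = [p for p, y in zip(points, labels) if y == 1]
    neg = [p for p, y in zip(points, labels) if y == -1]
    return max(pos) < min(neg) or max(neg) < min(pos)


def lift(x):
    """Non-linear map from the line to the plane: x -> (x, x^2)."""
    return (x, x * x)


def score(w, b, z):
    """Signed value of the hyperplane function w . z + b at the point z."""
    return sum(wi * zi for wi, zi in zip(w, z)) + b


# Hypothetical data: the outer points are class +1, the inner ones class -1.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [1, -1, -1, -1, 1]

# Hand-picked hyperplane x^2 = 2.5 in the lifted space (an assumption).
w, b = (0.0, 1.0), -2.5

predictions = [1 if score(w, b, lift(x)) > 0 else -1 for x in xs]
norm_w = math.sqrt(sum(wi * wi for wi in w))
margin = min(y * score(w, b, lift(x)) / norm_w for x, y in zip(xs, ys))

print(separable_by_threshold(xs, ys))  # no threshold works in one dimension
print(predictions == ys)               # the lifted data are separated
print(margin)                          # distance of the closest lifted points
```

In the lifted space, the points closest to the hyperplane play the role of support vectors. A kernel SVM obtains the same effect without computing the mapping explicitly.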


Predicting TUG score from gait characteristics based on video analysis and machine learning

arXiv.org Machine Learning

Falls are a leading cause of death among the elderly and a burden on society. The Timed Up and Go (TUG) test is a common tool for fall risk assessment. In this paper, we propose a method for predicting TUG score from gait characteristics extracted from video based on computer vision and machine learning technologies. First, 3D pose is estimated from video captured with 2D and 3D cameras during human motion, and then a group of gait characteristics is computed from the 3D pose series. After that, copula entropy is used to select the characteristics that are most strongly associated with TUG score. Finally, the selected characteristics are fed into the predictive models to predict TUG score. Experiments on real-world data demonstrated the effectiveness of the proposed method. As a byproduct, associations between TUG score and several gait characteristics are discovered, which lay the scientific foundation of the proposed method and make the resulting predictive models interpretable to clinical users.