Support Vector Machines
The Influence of Dataset Partitioning on Dysfluency Detection Systems
Bayerl, Sebastian P., Wagner, Dominik, Nöth, Elmar, Bocklet, Tobias, Riedhammer, Korbinian
This paper empirically investigates the influence of different data splits and splitting strategies on the performance of dysfluency detection systems. For this, we perform experiments using wav2vec 2.0 models with a classification head as well as support vector machines (SVM) in conjunction with the features extracted from the wav2vec 2.0 model to detect dysfluencies. We train and evaluate the systems with different non-speaker-exclusive and speaker-exclusive splits of the Stuttering Events in Podcasts (SEP-28k) dataset to shed some light on the variability of results w.r.t. the partition method used. Furthermore, we show that the SEP-28k dataset is dominated by only a few speakers, making it difficult to evaluate. To remedy this problem, we created SEP-28k-Extended (SEP-28k-E), containing semi-automatically generated speaker and gender information for the SEP-28k corpus, and suggest different data splits, each useful for evaluating other aspects of methods for dysfluency detection.
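The difference between a speaker-exclusive and a non-speaker-exclusive split can be illustrated with a minimal sketch. It is not the authors' partitioning code: the function name `speaker_exclusive_split`, the `test_fraction` parameter, and the toy speaker labels are all assumptions made for illustration. The idea is to assign whole speakers, rather than individual clips, to one side of the split.

```python
import random
from collections import defaultdict

def speaker_exclusive_split(speaker_ids, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) so that no speaker appears in both."""
    # Group clip indices by the speaker who produced them.
    by_speaker = defaultdict(list)
    for idx, spk in enumerate(speaker_ids):
        by_speaker[spk].append(idx)

    # Shuffle speakers (not clips) so the split is made at speaker level.
    speakers = sorted(by_speaker)
    random.Random(seed).shuffle(speakers)

    target = test_fraction * len(speaker_ids)
    train_idx, test_idx = [], []
    for spk in speakers:
        # Fill the test set speaker by speaker until it reaches the target size.
        if len(test_idx) < target:
            test_idx.extend(by_speaker[spk])
        else:
            train_idx.extend(by_speaker[spk])
    return sorted(train_idx), sorted(test_idx)

# Hypothetical clip-to-speaker labels; speaker "A" dominates the toy corpus.
speakers = ["A"] * 6 + ["B"] * 2 + ["C"] * 1 + ["D"] * 1
train_idx, test_idx = speaker_exclusive_split(speakers, test_fraction=0.2)
train_speakers = {speakers[i] for i in train_idx}
test_speakers = {speakers[i] for i in test_idx}
```

When one speaker contributes most of the clips, the realized test size can deviate strongly from `test_fraction`, depending on which side that speaker lands on. This is one way a speaker-dominated corpus can make results sensitive to the partition.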
A Machine Learning Tutorial for Operational Meteorology, Part I: Traditional Machine Learning
Chase, Randy J., Harrison, David R., Burke, Amanda, Lackmann, Gary M., McGovern, Amy
Recently, the use of machine learning in meteorology has increased greatly. While many machine learning methods are not new, university classes on machine learning are largely unavailable to meteorology students and are not required to become a meteorologist. The lack of formal instruction has contributed to the perception that machine learning methods are 'black boxes', and thus end-users are hesitant to apply the machine learning methods in their everyday workflow. To reduce the opaqueness of machine learning methods and lower hesitancy towards machine learning in meteorology, this paper provides a survey of some of the most common machine learning methods. A familiar meteorological example is used to contextualize the machine learning methods while also discussing machine learning topics using plain language. The following machine learning methods are demonstrated: linear regression; logistic regression; decision trees; random forest; gradient boosted decision trees; naive Bayes; and support vector machines. Beyond discussing the different methods, the paper also contains discussions on the general machine learning process as well as best practices to enable readers to apply machine learning to their own datasets. Furthermore, all code (in the form of Jupyter notebooks and Google Colaboratory notebooks) used to make the examples in the paper is provided in an effort to catalyse the use of machine learning in meteorology.
Analysis, Characterization, Prediction and Attribution of Extreme Atmospheric Events with Machine Learning: a Review
Salcedo-Sanz, Sancho, Pérez-Aracil, Jorge, Ascenso, Guido, Del Ser, Javier, Casillas-Pérez, David, Kadow, Christopher, Fister, Dusan, Barriopedro, David, García-Herrera, Ricardo, Restelli, Marcello, Giuliani, Mateo, Castelletti, Andrea
Atmospheric Extreme Events (EEs) cause severe damage to human societies and ecosystems. The frequency and intensity of EEs and other associated events are increasing under current climate change and global warming. The accurate prediction, characterization, and attribution of atmospheric EEs is therefore a key research field, in which many groups are currently working by applying different methodologies and computational tools. Machine Learning (ML) methods have arisen in recent years as powerful techniques to tackle many of the problems related to atmospheric EEs. This paper reviews the ML algorithms applied to the analysis, characterization, prediction, and attribution of the most important atmospheric EEs. A summary of the most used ML techniques in this area, and a comprehensive critical review of literature related to ML in EEs, are provided. A number of examples are discussed and perspectives and outlooks on the field are drawn.
Statistical Learning
This is an introductory-level course in supervised learning, with a focus on regression and classification methods. The syllabus includes: linear and polynomial regression, logistic regression and linear discriminant analysis; cross-validation and the bootstrap, model selection and regularization methods (ridge and lasso); nonlinear models, splines and generalized additive models; tree-based methods, random forests and boosting; support-vector machines; neural networks and deep learning; survival models; multiple testing. Some unsupervised learning methods are discussed: principal components and clustering (k-means and hierarchical). This is not a math-heavy class, so we try and describe the methods without heavy reliance on formulas and complex mathematics. We focus on what we consider to be the important elements of modern data science.
Temporal support vectors for spiking neuronal networks
When neural circuits learn to perform a task, it is often the case that there are many sets of synaptic connections that are consistent with the task. However, only a small number of possible solutions are robust to noise in the input and are capable of generalizing their performance of the task to new inputs. Finding such good solutions is an important goal of learning systems in general and neuronal circuits in particular. For systems operating with static inputs and outputs, a well-known approach to the problem is the family of large-margin methods such as Support Vector Machines (SVM). By maximizing the distance of the data vectors from the decision surface, these solutions enjoy increased robustness to noise and enhanced generalization abilities. Furthermore, the use of the kernel method enables SVMs to perform classification tasks that require nonlinear decision surfaces. However, for dynamical systems with event-based outputs, such as spiking neural networks and other continuous-time threshold-crossing systems, this optimality criterion is inapplicable due to the strong temporal correlations in their input and output. We introduce a novel extension of the static SVMs - The Temporal Support Vector Machine (T-SVM). The T-SVM finds a solution that maximizes a new construct - the dynamical margin. We show that T-SVM and its kernel extensions generate robust synaptic weight vectors in spiking neurons and enable their learning of tasks that require nonlinear spatial integration of synaptic inputs. We propose T-SVM with nonlinear kernels as a new model of the computational role of the nonlinearities and extensive morphologies of neuronal dendritic trees.
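The static margin that the abstract refers to (the distance of the data vectors from the decision surface) can be made concrete with a small sketch. It covers only the textbook linear case, not the dynamical margin of the T-SVM. The function name `geometric_margin` and the toy points are assumptions for illustration.

```python
import math

def geometric_margin(w, b, points, labels):
    """Smallest signed distance of any labeled point to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return min(
        y * (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm
        for x, y in zip(points, labels)
    )

# Hypothetical 2-D toy data with labels in {-1, +1}.
points = [(2.0, 2.0), (3.0, 1.0), (-1.0, -1.0), (-2.0, 0.0)]
labels = [1, 1, -1, -1]

# Both hyperplanes separate the data, but they leave different margins.
margin_a = geometric_margin((1.0, 1.0), 0.0, points, labels)
margin_b = geometric_margin((1.0, 0.0), -1.0, points, labels)
```

Both candidate hyperplanes classify every toy point correctly. A large-margin method prefers the one with the larger minimum distance, which here is the first.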
Feature subset selection for kernel SVM classification via mixed-integer optimization
Tamura, Ryuta, Takano, Yuichi, Miyashiro, Ryuhei
We study the mixed-integer optimization (MIO) approach to feature subset selection in nonlinear kernel support vector machines (SVMs) for binary classification. First proposed for linear regression in the 1970s, this approach has recently moved into the spotlight with advances in optimization algorithms and computer hardware. The goal of this paper is to establish an MIO approach for selecting the best subset of features for kernel SVM classification. To measure the performance of subset selection, we use the kernel-target alignment, which is the distance between the centroids of two response classes in a high-dimensional feature space. We propose a mixed-integer linear optimization (MILO) formulation based on the kernel-target alignment for feature subset selection, and this MILO problem can be solved to optimality using optimization software. We also derive a reduced version of the MILO problem to accelerate our MILO computations. Experimental results show good computational efficiency for our MILO formulation with the reduced problem. Moreover, our method can often outperform the linear-SVM-based MILO formulation and recursive feature elimination in prediction performance, especially when there are relatively few data instances.
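The distance between class centroids in a kernel-induced feature space, which the abstract uses to score a feature subset, can be computed from kernel evaluations alone. The sketch below shows only that computation; it is not the paper's MILO formulation. The RBF kernel, its `gamma` value, the helper names, and the toy data are assumptions for illustration.

```python
import math

def rbf_kernel(x, z, gamma=1.0):
    """Gaussian (RBF) kernel; gamma is an assumed bandwidth parameter."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))

def mean_kernel(xs, zs, kernel):
    """Average kernel value over all pairs drawn from xs and zs."""
    return sum(kernel(x, z) for x in xs for z in zs) / (len(xs) * len(zs))

def squared_centroid_distance(pos, neg, kernel):
    """||mu_pos - mu_neg||^2 in feature space, computed from kernel values only."""
    return (
        mean_kernel(pos, pos, kernel)
        + mean_kernel(neg, neg, kernel)
        - 2.0 * mean_kernel(pos, neg, kernel)
    )

def select(points, features):
    """Keep only the listed feature indices of each point."""
    return [tuple(p[j] for j in features) for p in points]

# Hypothetical two-feature data: feature 0 separates the classes, feature 1 does not.
pos = [(1.0, 0.3), (1.2, -0.4)]
neg = [(-1.0, 0.2), (-1.1, -0.3)]

dist_informative = squared_centroid_distance(select(pos, [0]), select(neg, [0]), rbf_kernel)
dist_noise = squared_centroid_distance(select(pos, [1]), select(neg, [1]), rbf_kernel)
```

On this toy data the subset containing the informative feature yields a much larger centroid distance than the subset containing the uninformative one. A subset-selection procedure would reward that difference.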
Development and internal validation of a machine-learning-developed model for predicting 1-year mortality after fragility hip fracture - BMC Geriatrics
Fragility hip fracture increases morbidity and mortality in older adult patients, especially within the first year. Identification of patients at high risk of death facilitates modification of associated perioperative factors that can reduce mortality. Various machine learning algorithms have been developed and are widely used in healthcare research, particularly for mortality prediction. This study aimed to develop and internally validate 7 machine learning models to predict 1-year mortality after fragility hip fracture. This retrospective study included patients with fragility hip fractures from a single center (Siriraj Hospital, Bangkok, Thailand) from July 2016 to October 2018. A total of 492 patients were enrolled. They were randomly categorized into a training group (344 cases, 70%) or a testing group (148 cases, 30%). Various machine learning techniques were used: the Gradient Boosting Classifier (GB), Random Forests Classifier (RF), Artificial Neural Network Classifier (ANN), Logistic Regression Classifier (LR), Naive Bayes Classifier (NB), Support Vector Machine Classifier (SVM), and K-Nearest Neighbors Classifier (KNN). All models were internally validated by evaluating their performance and the area under a receiver operating characteristic curve (AUC). For the testing dataset, the accuracies were GB model = 0.93, RF model = 0.95, ANN model = 0.94, LR model = 0.91, NB model = 0.89, SVM model = 0.90, and KNN model = 0.90. All models achieved high AUCs that ranged between 0.81 and 0.99. The RF model also provided a negative predictive value of 0.96, a positive predictive value of 0.93, a specificity of 0.99, and a sensitivity of 0.68. Our machine learning approach facilitated the successful development of an accurate model to predict 1-year mortality after fragility hip fracture.
Several machine learning algorithms (eg, Gradient Boosting and Random Forest) had the potential to provide high predictive performance based on the clinical parameters of each patient. The web application is available at www.hipprediction.com . External validation in a larger group of patients or in different hospital settings is warranted to evaluate the clinical utility of this tool. Thai Clinical Trials Registry (22 February 2021; reg. no. TCTR20210222003 ).
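The reported quantities (accuracy, sensitivity, specificity, positive and negative predictive value) all derive from a binary confusion matrix. A minimal sketch of those definitions follows. The counts are hypothetical and chosen only for illustration; they are not the study's confusion matrix, which the abstract does not give.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard confusion-matrix metrics for a binary classifier."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),  # recall on the positive class
        "specificity": tn / (tn + fp),  # recall on the negative class
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }

# Hypothetical counts for illustration only; NOT taken from the study.
metrics = binary_metrics(tp=8, fp=2, tn=85, fn=5)
```

With an imbalanced outcome such as mortality, accuracy and specificity can both be high while sensitivity stays noticeably lower, as in these toy counts. This is why the individual metrics are worth reading alongside accuracy.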
Pan-African Artificial Intelligence and Smart Systems
This book constitutes the refereed post-conference proceedings of the First International Conference on Pan-African Intelligence and Smart Systems, PAAISS 2021, which was held in Windhoek, Namibia, in September 2021. The 17 revised full papers presented were carefully selected from 41 submissions. The theme of PAAISS 2021 was "Advancing AI research in Africa" and the papers are arranged according to subject areas: Deep Learning; Classification and Pattern Recognition; Neural Networks and Support Vector Machines; Smart Systems.
CS229: Machine Learning - AI Summary
This course provides a broad introduction to machine learning and statistical pattern recognition. Topics include: supervised learning (generative/discriminative learning, parametric/non-parametric learning, neural networks, support vector machines); unsupervised learning (clustering, dimensionality reduction, kernel methods); learning theory (bias/variance tradeoffs, practical advice); reinforcement learning and adaptive control. The course will also discuss recent applications of machine learning, such as to robotic control, data mining, autonomous navigation, bioinformatics, speech recognition, and text and web data processing.
Classification Auto-Encoder based Detector against Diverse Data Poisoning Attacks
Poisoning attacks are a category of adversarial machine learning threats in which an adversary attempts to subvert the outcome of a machine learning system by injecting crafted data into the training data set, thus increasing the machine learning model's test error. The adversary can tamper with the data feature space, data labels, or both, each leading to a different attack strategy with different strengths. Various detection approaches have recently emerged, each focusing on one attack strategy. The Achilles heel of many of these detection approaches is their dependence on having access to a clean, untampered data set. In this paper, we propose CAE, a Classification Auto-Encoder based detector against diverse poisoned data. CAE can detect all forms of poisoning attacks using a combination of reconstruction and classification errors without having any prior knowledge of the attack strategy. We show that an enhanced version of CAE (called CAE+) does not have to employ a clean data set to train the defense model. Our experimental results on three real datasets (MNIST, Fashion-MNIST, and CIFAR) demonstrate that our proposed method can maintain its functionality under up to 30% contaminated data and help the defended SVM classifier to regain its best accuracy.