Support Vector Machines


Proof-of-concept system uses smart speakers to catch signs of cardiac arrest

#artificialintelligence

In an effort to tackle in-home cardiac arrest, University of Washington researchers have devised a novel contactless system that uses smartphones or voice-based personal assistants to identify telltale breathing patterns that accompany an attack. The proof-of-concept strategy, described in an NPJ Digital Medicine paper published this morning, involved a supervised machine learning model called a support-vector machine that was trained for use in the bedroom, a controlled environment in which the majority of in-home cardiac arrests occur. "Sometimes reported as 'gasping' breaths, agonal respirations may hold potential as an audible diagnostic biomarker, particularly in unwitnessed cardiac arrests that occur in a private residence, the location of [two-thirds] of all [out-of-hospital cardiac arrests]," the researchers wrote. "The widespread adoption of smartphones and smart speakers (projected to be in 75% of US households by 2020) presents a unique opportunity to identify this audible biomarker and connect unwitnessed cardiac arrest victims to emergency medical services (EMS) or others who can administer cardiopulmonary resuscitation." Cross-validation analysis of the trained classifier yielded an overall sensitivity and specificity of 97.24% and 99.51%, respectively.
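For readers less familiar with the two reported metrics, the sketch below shows how sensitivity and specificity are computed from binary predictions. The function name and the toy labels are hypothetical and are not taken from the study; the label 1 is assumed to mean "agonal breathing detected".

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    sensitivity = tp / (tp + fn)  # share of real events that were caught
    specificity = tn / (tn + fp)  # share of non-events correctly ignored
    return sensitivity, specificity


# Hypothetical toy labels, not data from the paper.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
```

In a cross-validated setting, these counts would typically be accumulated over the held-out folds before the two ratios are taken.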


Quantum-Inspired Support Vector Machine

arXiv.org Machine Learning

The support vector machine (SVM) is a particularly powerful and flexible supervised learning model that analyzes data for both classification and regression, and its usual complexity scales polynomially with the dimension and number of data points. Inspired by the quantum SVM, we present a quantum-inspired classical algorithm for SVM using fast sampling techniques. In our approach, we develop a general method to approximately calculate the kernel function and perform classification by carefully sampling the data matrix; our approach can therefore be applied to various types of SVM, such as linear SVM, poly-kernel SVM and soft SVM. Theoretical analysis shows that one can find the supported hyperplanes on a data set to which we have sampling access, and thus perform classification with arbitrary success probability in logarithmic runtime, matching the runtime of the quantum SVM.
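As a rough illustration of what "approximately calculating the kernel by sampling" can look like, the sketch below estimates a linear kernel value (an inner product) with length-squared sampling, a technique commonly used in quantum-inspired algorithms. It is assumed here for illustration only and is not the paper's algorithm; the function name and toy vectors are hypothetical.

```python
import random


def sampled_inner_product(x, y, n_samples, rng):
    """Estimate the inner product <x, y> via length-squared sampling of x."""
    norm_sq = sum(v * v for v in x)
    # Index i is drawn with probability x[i]**2 / ||x||**2, so zero entries
    # of x are never selected and the division below is safe.
    weights = [v * v for v in x]
    indices = rng.choices(range(len(x)), weights=weights, k=n_samples)
    # Each draw contributes ||x||**2 * y[i] / x[i], whose expectation is <x, y>.
    return sum(norm_sq * y[i] / x[i] for i in indices) / n_samples


# Hypothetical toy vectors; the exact linear-kernel value is 3 - 8 + 6 = 1.
x = [3.0, 4.0, 12.0]
y = [1.0, -2.0, 0.5]
exact = sum(a * b for a, b in zip(x, y))
estimate = sampled_inner_product(x, y, n_samples=20000, rng=random.Random(0))
```

This toy version builds the sampling weights in linear time. The logarithmic runtime mentioned in the abstract relies on already having sampling access to the data, which this sketch does not model.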


Support vector machines on the D-Wave quantum annealer

arXiv.org Machine Learning

Kernel-based support vector machines (SVMs) are supervised machine learning algorithms for classification and regression problems. We present a method to train SVMs on a D-Wave 2000Q quantum annealer and study its performance in comparison to SVMs trained on conventional computers. The method is applied to both synthetic data and real data obtained from biology experiments. We find that the quantum annealer produces an ensemble of different solutions that often generalizes better to unseen data than the single global minimum of an SVM trained on a conventional computer, especially in cases where only limited training data is available. For cases with more training data than currently fits on the quantum annealer, we show that a combination of classifiers for subsets of the data almost always produces stronger joint classifiers than the conventional SVM for the same parameters.


Persistent homology detects curvature

arXiv.org Machine Learning

In topological data analysis, persistent homology is used to study the "shape of data". Persistent homology computations are completely characterized by a set of intervals called a bar code. It is often said that the long intervals represent the "topological signal" and the short intervals represent "noise". We give evidence to dispute this thesis, showing that the short intervals encode geometric information. Specifically, we prove that persistent homology detects the curvature of disks from which points have been sampled. We describe a general computational framework for solving inverse problems using the average persistence landscape, a continuous mapping from metric spaces with a probability measure to a Hilbert space. In the present application, the average persistence landscapes of points sampled from disks of constant curvature result in a path in this Hilbert space which may be learned using standard tools from statistical and machine learning.


A meta-learning recommender system for hyperparameter tuning: predicting when tuning improves SVM classifiers

arXiv.org Machine Learning

For many machine learning algorithms, predictive performance is critically affected by the hyperparameter values used to train them. However, tuning these hyperparameters can come at a high computational cost, especially on larger datasets, while the tuned settings do not always significantly outperform the default values. This paper proposes a recommender system based on meta-learning to identify exactly when it is better to use default values and when to tune hyperparameters for each new dataset. In addition, an in-depth analysis is performed to understand what the meta-learners take into account in their decisions, providing useful insights. An extensive analysis of different categories of meta-features, meta-learners, and setups across 156 datasets is performed. Results show that it is possible to accurately predict when tuning will significantly improve the performance of the induced models. The proposed system reduces the time spent on optimization processes, without reducing the predictive performance of the induced models (when compared with the ones obtained using tuned hyperparameters). We also explain the decision-making process of the meta-learners in terms of linear separability-based hypotheses. Although this analysis is focused on the tuning of Support Vector Machines, it can also be applied to other algorithms, as shown in experiments performed with decision trees.


Medium-Term Load Forecasting Using Support Vector Regression, Feature Selection, and Symbiotic Organism Search Optimization

arXiv.org Machine Learning

Accurate load forecasting has always been an indispensable part of the operation and planning of power systems. Among the different forecasting time horizons, short-term load forecasting (STLF) and long-term load forecasting (LTLF) have benefited from accurate predictors and probabilistic forecasting, respectively, whereas medium-term load forecasting (MTLF) demands more attention due to its vital role in power system operation and planning, such as optimal scheduling of generation units, robust planning program for customer service, and economic supply. In this study, a hybrid method composed of Support Vector Regression (SVR) and the Symbiotic Organism Search Optimization (SOSO) method is proposed for MTLF. In the proposed forecasting model, SVR is the main part of the forecasting algorithm, while SOSO is embedded into it to optimize the parameters of SVR. In addition, a minimum redundancy-maximum relevance feature selection algorithm is used in the preprocessing of input data. The proposed method is tested on the EUNITE competition dataset to demonstrate its performance. Furthermore, it is compared with some previous works to show the suitability of our method.
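To make the role of the SVR parameters more concrete, the sketch below computes the standard epsilon-insensitive loss that SVR minimizes: errors inside a tube of half-width epsilon cost nothing. The abstract does not list which parameters SOSO optimizes; epsilon is shown here only as one plausible candidate, and the load values are hypothetical.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Per-sample SVR loss: zero inside the epsilon tube, linear outside it."""
    return [max(0.0, abs(t - p) - epsilon) for t, p in zip(y_true, y_pred)]


# Hypothetical daily peak loads (MW) and forecasts, not EUNITE data.
actual = [700.0, 720.0, 750.0]
forecast = [705.0, 740.0, 748.0]
losses = epsilon_insensitive_loss(actual, forecast, epsilon=10.0)
```

Only the second forecast misses by more than 10 MW, so it is the only one that contributes to the loss. A wider tube tolerates more error, which is why the choice of epsilon changes the fitted model.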


Making Classical Machine Learning Pipelines Differentiable: A Neural Translation Approach

arXiv.org Machine Learning

Classical Machine Learning (ML) pipelines often comprise multiple ML models, where the models within a pipeline are trained in isolation. Conversely, when training neural network models, layers composing the neural models are simultaneously trained using backpropagation. We argue that the isolated training scheme of ML pipelines is sub-optimal, since it cannot jointly optimize multiple components. To this end, we propose a framework that translates a pre-trained ML pipeline into a neural network and fine-tunes the ML models within the pipeline jointly using backpropagation. Our experiments show that fine-tuning of the translated pipelines is a promising technique able to increase the final accuracy.


Validating the Validation: Reanalyzing a large-scale comparison of Deep Learning and Machine Learning models for bioactivity prediction

arXiv.org Machine Learning

Machine learning methods may have the potential to significantly accelerate drug discovery. However, the increasing rate of new methodological approaches being published in the literature raises the fundamental question of how models should be benchmarked and validated. We reanalyze the data generated by a recently published large-scale comparison of machine learning models for bioactivity prediction and arrive at a somewhat different conclusion. We show that the performance of support vector machines is competitive with that of deep learning methods. Additionally, using a series of numerical experiments, we question the relevance of area under the receiver operating characteristic curve as a metric in virtual screening, and instead suggest that area under the precision-recall curve should be used in conjunction with the receiver operating characteristic. Our numerical experiments also highlight challenges in estimating the uncertainty in model performance via scaffold-split nested cross validation.
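The contrast between the two metrics can be seen on a small imbalanced example. The sketch below computes the area under the ROC curve as a pairwise ranking probability and uses average precision as one common estimate of the area under the precision-recall curve. The function names and the toy screening data are hypothetical, and scores are assumed to be distinct when ranking for average precision.

```python
def auroc(labels, scores):
    """Probability that a random active outranks a random inactive."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    # Ties between an active and an inactive count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values at the rank of each active."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Hypothetical screen: 20 compounds, 2 actives ranked 3rd and 5th by score.
scores = list(range(20, 0, -1))
labels = [0, 0, 1, 0, 1] + [0] * 15
roc_value = auroc(labels, scores)
pr_value = average_precision(labels, scores)
```

Here the ROC area is about 0.86 while the average precision is only about 0.37, because with few actives a handful of higher-ranked inactives sharply lowers precision but barely moves the ROC curve.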


Automatically Evaluating Balance: A Machine Learning Approach

arXiv.org Machine Learning

Compared to in-clinic balance training, in-home training is not as effective. This is, in part, due to the lack of feedback from physical therapists (PTs). Here, we analyze the feasibility of using trunk sway data and machine learning (ML) techniques to automatically evaluate balance, providing accurate assessments outside of the clinic. We recruited sixteen participants to perform standing balance exercises. For each exercise, we recorded trunk sway data and had a PT rate balance performance on a scale of 1 to 5. The rating scale was adapted from the Functional Independence Measure. From the trunk sway data, we extracted a 61-dimensional feature vector representing performance of each exercise. Given these labeled data, we trained a multi-class support vector machine (SVM) to map trunk sway features to PT ratings. Evaluated in a leave-one-participant-out scheme, the model achieved a classification accuracy of 82%. Compared to participant self-assessment ratings, the SVM outputs were significantly closer to PT ratings. The results of this pilot study suggest that in the absence of PTs, ML techniques can provide accurate assessments during standing balance exercises. Such automated assessments could reduce PT consultation time and increase user compliance outside of the clinic.
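A leave-one-participant-out scheme holds out all recordings from one participant at a time, so the model is never tested on a person it has seen during training. The sketch below generates such splits; the function name and the participant identifiers are hypothetical and do not come from the study.

```python
def leave_one_participant_out(participant_ids):
    """Yield (held_out, train_idx, test_idx), holding out each participant once."""
    for held_out in sorted(set(participant_ids)):
        train_idx = [i for i, pid in enumerate(participant_ids) if pid != held_out]
        test_idx = [i for i, pid in enumerate(participant_ids) if pid == held_out]
        yield held_out, train_idx, test_idx


# Hypothetical recording-to-participant mapping: one entry per exercise trial.
participant_ids = ["p1", "p1", "p2", "p2", "p3"]
splits = list(leave_one_participant_out(participant_ids))
```

Each tuple gives the indices to train on and the indices to evaluate on. Libraries such as scikit-learn offer an equivalent splitter (`LeaveOneGroupOut`), which may be preferable in practice.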


Selecting Biomarkers for building optimal treatment selection rules using Kernel Machines

arXiv.org Machine Learning

Optimal biomarker combinations for treatment-selection can be derived by minimizing total burden to the population caused by the targeted disease and its treatment. However, when multiple biomarkers are present, including all of them in the model can be expensive and hurt model performance. To remedy this, we consider feature selection in optimization by minimizing an extended total burden that additionally incorporates biomarker measurement costs. Formulating it as a 0-norm penalized weighted classification, we develop various procedures for estimating linear and nonlinear combinations. Through simulations and a real data example, we demonstrate the importance of incorporating feature-selection and marker cost when deriving treatment-selection rules.