 Support Vector Machines


A Computer-Aided System for Determining the Application Range of a Warfarin Clinical Dosing Algorithm Using Support Vector Machines with a Polynomial Kernel Function

arXiv.org Machine Learning

Determining the optimal initial dose for warfarin is a critically important task. Several factors have an impact on the therapeutic dose for individual patients, such as patients' physical attributes (age, height, etc.), medication profile, co-morbidities, and metabolic genotypes (CYP2C9 and VKORC1). This wide range of factors influencing the therapeutic dose creates a complex environment for clinicians to determine the optimal initial dose. Using a sample of 4,237 patients, we have proposed a companion classification model to one of the most popular dosing algorithms (the International Warfarin Pharmacogenetics Consortium (IWPC) clinical model), which identifies the appropriate cohort of patients for applying this model. The proposed model functions as a clinical decision support system which assists clinicians in dosing. We have developed a classification model using Support Vector Machines with a polynomial kernel function to determine if applying the dose prediction model is appropriate for a given patient. The IWPC clinical model will only be used if the patient is classified as "Safe for model". By using the proposed methodology, the dosing model's prediction accuracy increases by 15 percent in terms of Root Mean Squared Error and 17 percent in terms of Mean Absolute Error in dose estimates of patients classified as "Safe for model".
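As a rough illustration of the two ideas in this abstract, the sketch below shows the standard polynomial kernel form and the gating rule that applies the dosing model only to patients labelled "Safe for model". The parameter names `degree`, `gamma` and `coef0`, their default values, and the helper `should_apply_dosing_model` are assumptions made for illustration; the abstract does not state the kernel settings the authors used.

```python
def polynomial_kernel(x, y, degree=3, gamma=1.0, coef0=1.0):
    """Polynomial kernel K(x, y) = (gamma * <x, y> + coef0) ** degree.

    degree, gamma and coef0 are illustrative defaults, not values
    taken from the paper.
    """
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(x, y))
    return (gamma * dot + coef0) ** degree


def should_apply_dosing_model(predicted_label):
    """Hypothetical gate: use the dose model only for 'Safe for model'."""
    return predicted_label == "Safe for model"


if __name__ == "__main__":
    # <x, y> = 1*3 + 2*4 = 11, so K = (11 + 1) ** 2
    print(polynomial_kernel([1, 2], [3, 4], degree=2))
    print(should_apply_dosing_model("Safe for model"))
```

In practice the kernel would be evaluated inside an SVM solver; the function is written out here only to make the formula concrete.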


Prescriptive Cluster-Dependent Support Vector Machines with an Application to Reducing Hospital Readmissions

arXiv.org Machine Learning

We augment linear Support Vector Machine (SVM) classifiers by adding three important features: (i) we introduce a regularization constraint to induce a sparse classifier; (ii) we devise a method that partitions the positive class into clusters and selects a sparse SVM classifier for each cluster; and (iii) we develop a method to optimize the values of controllable variables in order to reduce the number of data points which are predicted to have an undesirable outcome, which, in our setting, coincides with being in the positive class. The latter feature leads to personalized prescriptions/recommendations. We apply our methods to the problem of predicting and preventing hospital readmissions within 30 days from discharge for patients who underwent a general surgical procedure. To that end, we leverage a large dataset containing over 2.28 million patients who had surgeries in the period 2011-2014 in the U.S. The dataset has been collected as part of the American College of Surgeons National Surgical Quality Improvement Program (NSQIP).


Detecting Activities of Daily Living and Routine Behaviours in Dementia Patients Living Alone Using Smart Meter Load Disaggregation

arXiv.org Machine Learning

The emergence of an ageing population is a significant public health concern. This has led to an increase in the number of people living with progressive neurodegenerative disorders like dementia. Consequently, the strain this places on health and social care services means providing 24-hour monitoring is not sustainable. Technological intervention is being considered; however, no solution exists to non-intrusively monitor the independent living needs of patients with dementia. As a result, many patients hit crisis point before intervention and support is provided. In parallel, patient care relies on feedback from informal carers about significant behavioural changes. Yet not all people have a social support network, and early intervention in dementia care is often missed. The smart meter rollout has the potential to change this. Using machine learning and signal processing techniques, a home energy supply can be disaggregated to detect which home appliances are turned on and off. This will allow Activities of Daily Living (ADLs) to be assessed, such as eating and drinking, and observed changes in routine to be detected for early intervention. The primary aim is to help reduce deterioration and enable patients to stay in their homes for longer. A Support Vector Machine (SVM) and a Random Decision Forest classifier are modelled using data from three test homes. The trained models are then used to monitor two patients with dementia during a six-month clinical trial undertaken in partnership with Mersey Care NHS Foundation Trust. In the case of load disaggregation for appliance detection, the SVM achieved AUC=0.86074, Sen=0.756 and Spec=0.92838, while the Decision Forest achieved AUC=0.9429, Sen=0.9634 and Spec=0.9634. ADLs are also analysed to identify the behavioural patterns of the occupant while detecting alterations in routine.
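The sensitivity (Sen) and specificity (Spec) figures quoted above are standard confusion-matrix ratios. A minimal sketch of how they are computed for a binary appliance on/off detector follows. The label encoding (1 = appliance on, 0 = off) and the sample data are invented for illustration and are not from the study.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels 0/1.

    sensitivity = TP / (TP + FN), specificity = TN / (TN + FP).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative label")
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # Made-up example: 1 = appliance on, 0 = appliance off.
    truth = [1, 1, 1, 1, 0, 0, 0, 0]
    preds = [1, 1, 1, 0, 0, 0, 0, 1]
    sens, spec = sensitivity_specificity(truth, preds)
    print(f"Sen={sens:.3f} Spec={spec:.3f}")
```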


Emotion Recognition with Machine Learning Using EEG Signals

arXiv.org Machine Learning

In this research, an emotion recognition system is developed based on the valence/arousal model using electroencephalography (EEG) signals. EEG signals are decomposed into the gamma, beta, alpha and theta frequency bands using discrete wavelet transform (DWT), and spectral features are extracted from each frequency band. Principal component analysis (PCA) is applied to the extracted features by preserving the same dimensionality, as a transform, to make the features mutually uncorrelated. Support vector machine (SVM), K-nearest neighbor (KNN) and artificial neural network (ANN) are used to classify emotional states. The cross-validated SVM with radial basis function (RBF) kernel, using extracted features of 10 EEG channels, performs with 91.3% accuracy for arousal and 91.1% accuracy for valence, both in the beta frequency band. Our approach shows better performance compared to existing algorithms applied to the "DEAP" dataset.


Natural language processing helps hospital predict downstream demand for imaging services

#artificialintelligence

The authors said an automated method for predicting future imaging resource utilization could help streamline the process, paving the way for capacity management strategies that could help meet the increased but unpredictable demand for radiology services. Using data from all hepatocellular carcinoma (HCC) surveillance CT exams performed at their hospital between 2010 and 2017, they used open-source NLP and machine learning software to parse free-text radiology reports into bag-of-words and term frequency-inverse document frequency (TF-IDF) models. In NLP, bag-of-words refers to the frequency with which words occur in a report summary, while TF-IDF considers the number of times a word appears in the summary and measures the uniqueness of specific terms in the context of entire report collections. Brown and Kachura also used three machine learning techniques--logistic regression, support vector machine (SVM) and random forest--to make their predictions. As a whole, the authors found bag-of-words models were somewhat inferior to the TF-IDF approach, with the TF-IDF and SVM combination yielding the most favorable results.
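The difference between the two text representations described above can be sketched in a few lines. This is a simplified illustration and not the open-source software the authors used: whitespace tokenization, the unsmoothed weighting `count * log(N / df)`, and the sample report snippets are all assumptions.

```python
import math
from collections import Counter


def tokenize(text):
    """Naive lowercase whitespace tokenizer (illustrative only)."""
    return text.lower().split()


def bag_of_words(doc):
    """Raw term counts for one document."""
    return Counter(tokenize(doc))


def tf_idf(docs):
    """Per-document weights: term count * log(N / document frequency)."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))  # count each term once per document
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append(
            {term: c * math.log(n_docs / doc_freq[term]) for term, c in counts.items()}
        )
    return weights


if __name__ == "__main__":
    # Invented report snippets, not real radiology text.
    reports = ["liver lesion stable", "liver lesion enlarged", "no lesion seen"]
    print(bag_of_words(reports[0]))
    print(tf_idf(reports)[0])
```

A term such as "lesion" that appears in every report gets a TF-IDF weight of zero, while a term unique to one report is weighted most heavily; bag-of-words treats both the same.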


Multi-Stage Fault Warning for Large Electric Grids Using Anomaly Detection and Machine Learning

arXiv.org Machine Learning

In the monitoring of a complex electric grid, it is of paramount importance to provide operators with early warnings of anomalies detected on the network, along with a precise classification and diagnosis of the specific fault type. In this paper, we propose a novel multi-stage early warning system prototype for electric grid fault detection, classification, subgroup discovery, and visualization. In the first stage, a computationally efficient anomaly detection method based on quartiles detects the presence of a fault in real time. In the second stage, the fault is classified into one of nine pre-defined disaster scenarios. The time series data are first mapped to highly discriminative features by applying dimensionality reduction based on temporal autocorrelation. The features are then mapped through one of three classification techniques: support vector machine, random forest, and artificial neural network. Finally in the third stage, intra-class clustering based on dynamic time warping is used to characterize the fault with further granularity. Results on the Bonneville Power Administration electric grid data show that i) the proposed anomaly detector is both fast and accurate; ii) dimensionality reduction leads to dramatic improvement in classification accuracy and speed; iii) the random forest method offers the most accurate, consistent, and robust fault classification; and iv) time series within a given class naturally separate into five distinct clusters which correspond closely to the geographical distribution of electric grid buses.
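The abstract does not spell out its quartile-based detector, so the sketch below uses one common quartile rule, Tukey's interquartile-range fences, purely as an assumed stand-in. The multiplier `k=1.5`, the function names, and the sample frequency readings are illustrative and not taken from the paper.

```python
import statistics


def iqr_fences(baseline, k=1.5):
    """Return (low, high) fences: Q1 - k*IQR and Q3 + k*IQR."""
    q1, _median, q3 = statistics.quantiles(baseline, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def is_anomalous(value, fences):
    """Flag a new reading that falls outside the fences."""
    low, high = fences
    return value < low or value > high


if __name__ == "__main__":
    # Invented grid-frequency readings in Hz under normal operation.
    normal = [59.98, 60.01, 60.00, 59.99, 60.02, 60.00, 59.97, 60.03]
    fences = iqr_fences(normal)
    for reading in (60.00, 58.50):
        print(reading, is_anomalous(reading, fences))
```

Because the fences are computed once from a baseline window, checking each new reading is a constant-time comparison, which suits a real-time first stage.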


Derisking machine learning and artificial intelligence

#artificialintelligence

Machine learning and artificial intelligence are set to transform the banking industry, using vast amounts of data to build models that improve decision making, tailor services, and improve risk management. According to the McKinsey Global Institute, this could generate value of more than $250 billion in the banking industry. (For the purposes of this article, machine learning is broadly defined to include algorithms that learn from data without being explicitly programmed, including, for example, random forests, boosted decision trees, support-vector machines, deep learning, and reinforcement learning. The definition includes both supervised and unsupervised algorithms. For a full primer on the applications of artificial intelligence, we refer the reader to "An executive's guide to AI.") But there is a downside, since machine-learning models amplify some elements of model risk.


Shapley regressions: A framework for statistical inference on machine learning models

arXiv.org Machine Learning

Machine learning models often excel in the accuracy of their predictions but are opaque due to their non-linear and non-parametric structure. This makes statistical inference challenging and disqualifies them from many applications where model interpretability is crucial. This paper proposes the Shapley regression framework as an approach for statistical inference on non-linear or non-parametric models. Inference is performed based on the Shapley value decomposition of a model, a pay-off concept from cooperative game theory. I show that universal approximators from machine learning are estimation consistent and introduce hypothesis tests for individual variable contributions, model bias and parametric functional forms. The inference properties of state-of-the-art machine learning models - like artificial neural networks, support vector machines and random forests - are investigated using numerical simulations and real-world data. The proposed framework is unique in the sense that it is identical to the conventional case of statistical inference on a linear model if the model is linear in parameters. This makes it a well-motivated extension to more general models and strengthens the case for the use of machine learning to inform decisions.
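To make the Shapley value decomposition concrete, the sketch below computes exact Shapley values for a toy model by enumerating feature subsets. It is not the paper's estimation procedure. Replacing "absent" features with a fixed baseline value, the toy linear model, and all names are assumptions for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, n_features):
    """Exact Shapley values; value_fn maps a frozenset of feature
    indices to the model payoff when only those features are present."""
    phi = [0.0] * n_features
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for size in range(len(others) + 1):
            # Weight of a coalition of this size in the Shapley formula.
            weight = (
                factorial(size) * factorial(n_features - size - 1)
                / factorial(n_features)
            )
            for subset in combinations(others, size):
                s = frozenset(subset)
                phi[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return phi


if __name__ == "__main__":
    weights = [2.0, 3.0, -1.0]   # toy linear model f(x) = w . x
    x = [1.0, 2.0, 3.0]          # point being explained
    baseline = [0.0, 0.0, 0.0]   # assumed value for absent features

    def value_fn(present):
        inputs = [x[j] if j in present else baseline[j] for j in range(len(x))]
        return sum(w * v for w, v in zip(weights, inputs))

    print(shapley_values(value_fn, len(x)))
```

For a linear model each Shapley value reduces to the coefficient times the feature's deviation from the baseline, which is in the spirit of the abstract's remark that the framework coincides with conventional inference when the model is linear in parameters.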


Cause Identification of Electromagnetic Transient Events using Spatiotemporal Feature Learning

arXiv.org Machine Learning

This paper presents a spatiotemporal unsupervised feature learning method for cause identification of electromagnetic transient events (EMTE) in power grids. The proposed method is formulated based on the availability of time-synchronized high-frequency measurements and uses a convolutional neural network (CNN) as the spatiotemporal feature representation, along with a softmax function. In contrast to existing threshold-based or energy-based event analysis methods, such as support vector machine (SVM), autoencoder, and tapered multi-layer perceptron (t-MLP) neural network, the proposed feature learning is carried out with respect to both time and space. The effectiveness of the proposed feature learning and the subsequent cause identification is validated through the EMTP simulation of different events such as line energization, capacitor bank energization, lightning, fault, and high-impedance fault in the IEEE 30-bus system, and the real-time digital simulation (RTDS) of the WSCC 9-bus system.


A tractable ellipsoidal approximation for voltage regulation problems

arXiv.org Machine Learning

We present a machine learning approach to the solution of chance constrained optimizations in the context of voltage regulation problems in power system operation. The novelty of our approach resides in approximating the feasible region of uncertainty with an ellipsoid. We formulate this problem using a learning model similar to Support Vector Machines (SVM) and propose a sampling algorithm that efficiently trains the model. We demonstrate our approach on a voltage regulation problem using standard IEEE distribution test feeders.