Goto

Collaborating Authors

 Support Vector Machines


Active Learning of SVDD Hyperparameter Values

arXiv.org Machine Learning

Support Vector Data Description is a popular method for outlier detection. However, its usefulness largely depends on selecting good hyperparameter values -- a difficult problem that has received significant attention in literature. Existing methods to estimate hyperparameter values are purely heuristic, and the conditions under which they work well are unclear. In this article, we propose LAMA (Local Active Min-Max Alignment), the first principled approach to estimate SVDD hyperparameter values by active learning. The core idea bases on kernel alignment, which we adapt to active learning with small sample sizes. In contrast to many existing approaches, LAMA provides estimates for both SVDD hyperparameters. These estimates are evidence-based, i.e., rely on actual class labels, and come with a quality score. This eliminates the need for manual validation, an issue with current heuristics. LAMA outperforms state-of-the-art competitors in extensive experiments on real-world data. In several cases, LAMA even yields results close to the empirical upper bound.


Text Classification in Python

#artificialintelligence

Overall, we obtain really good accuracy values for every model. We can observe that the Gradient Boosting, Logistic Regression and Random Forest models seem to overfit since they have an extremely high training set accuracy but a lower test set accuracy, so we'll discard them. We will choose the SVM classifier above the remaining models because it has the highest test set accuracy, which is really near to the training set accuracy.


On Classifying Sepsis Heterogeneity in the ICU: Insight Using Machine Learning

arXiv.org Machine Learning

Tel: 44 (0) 758 285 9501 Mesh Term s Machine Learning, Sepsis Syndrome Word Count: 2,000 AB STRACT Current machine learning models aiming to predict sepsis from Electronic Health Records (EHR) do not account for the heterogeneity of the condition, despite its e me r ging importance in prognosis and treatment. This work demonstrate s the added value of stratifying the types of organ dysfunction observed in patients who develop sepsis in the ICU in improving the ability to recognise patients at risk of sepsis from their EHR data. Using an ICU dataset of 13,728 records, we identify clinically significant sepsis subpopulations with distinct organ dysfunction patterns . Classification experiments using Random Forest, Gradient Boost Trees and Support Vector Machines, aiming to distinguish patients who develop sepsis in the ICU from those who do not, show that features selected using sepsis subpopulations as background knowledge yield a superior performance regardless of the classification model used. Our findings can steer mach ine learning efforts towards more personalised models for complex conditions including sepsis.


The relationship between trust in AI and trustworthy machine learning technologies

arXiv.org Artificial Intelligence

To build AI-based systems that users and the public can justifiably trust one needs to understand how machine learning technologies impact trust put in these services. To guide technology developments, this paper provides a systematic approach to relate social science concepts of trust with the technologies used in AI-based services and products. We conceive trust as discussed in the ABI (Ability, Benevolence, Integrity) framework and use a recently proposed mapping of ABI on qualities of technologies. We consider four categories of machine learning technologies, namely these for Fairness, Explainability, Auditability and Safety (FEAS) and discuss if and how these possess the required qualities. Trust can be impacted throughout the life cycle of AI-based systems, and we introduce the concept of Chain of Trust to discuss technological needs for trust in different stages of the life cycle. FEAS has obvious relations with known frameworks and therefore we relate FEAS to a variety of international Principled AI policy and technology frameworks that have emerged in recent years.


Location Forensics of Media Recordings Utilizing Cascaded SVM and Pole-matching Classifiers

arXiv.org Machine Learning

Information regarding the location of power distribution grid can be extracted from the power signature embedded in the multimedia signals (e.g., audio, video data) recorded near electrical activities. This implicit mechanism of identifying the origin-of-recording can be a very promising tool for multimedia forensics and security applications. In this work, we have developed a novel grid-of-origin identification system from media recording that consists of a number of support vector machine (SVM) followed by pole-matching (PM) classifiers. First, we determine the nominal frequency of the grid (50 or 60 Hz) based on the spectral observation. Then an SVM classifier, trained for the detection of a grid with a particular nominal frequency, narrows down the list of possible grids on the basis of di ff erent discriminating features extracted from the electric network frequency (ENF) signal. The decision of the SVM classifier is then passed to the PM classifier that detects the final grid based on the minimum distance between the estimated poles of test and training grids. Thus, we start from the problem of classifying grids with di fferent nominal frequencies and simplify the problem of classification in three stages based on nominal frequency, SVM and finally using PM classifier. This cascaded system of classification ensures better accuracy (15 .57% Keywords: Location forensics, ENF, nominal frequency, SVM, AR model, pole-matching classifier. 1. Introduction With the proliferation of terrorism, child pornography [1] or abuse on women, location forensics has become an important area of research in the 21 Success in identifying such locations properly can ease the process of getting hold of the criminals involved.


Machine learning applications in time series hierarchical forecasting

arXiv.org Machine Learning

Hierarchical forecasting (HF) is needed in many situations in the supply chain (SC) because managers often need different levels of forecasts at different levels of SC to make a decision. Top-Down (TD), Bottom-Up (BU) and Optimal Combination (COM) are common HF models. These approaches are static and often ignore the dynamics of the series while disaggregating them. Consequently, they may fail to perform well if the investigated group of time series are subject to large changes such as during the periods of promotional sales. We address the HF problem of predicting real-world sales time series that are highly impacted by promotion. We use three machine learning (ML) models to capture sales variations over time. Artificial neural networks (ANN), extreme gradient boosting (XGboost), and support vector regression (SVR) algorithms are used to estimate the proportions of lower-level time series from the upper level. We perform an in-depth analysis of 61 groups of time series with different volatilities and show that ML models are competitive and outperform some well-established models in the literature.


Predominant Musical Instrument Classification based on Spectral Features

arXiv.org Machine Learning

This work aims to examine one of the cornerstone problems of Musical Instrument Recognition, in particular instrument classification. IRMAS (Instrument recognition in Musical Audio Signals) data set is chosen. The data includes music obtained from various decades in the last century, thus having a wide variety in audio quality. We have presented a very concise summary of past work in this domain. Having implemented various supervised learning algorithms for this classification task, SVM classifier has outperformed the other state-of-the-art models with an accuracy of 79%. The classifier had a major challenge distinguishing between flute and organ. We also implemented Unsupervised techniques out of which Hierarchical Clustering has performed well. We have included most of the code (jupyter notebook) for easy reproducibility.


Game Theory in Artificial Intelligence

#artificialintelligence

Game Theory is a branch of mathematics used to model the strategic interaction between different players in a context with predefined rules and outcomes. Game Theory can also be used to describe many situations in our daily life and Machine Learning models (Figure 1). For example, a Classification algorithm such as SVM (Support Vector Machines) can be explained in terms of a two-player game in which one player is challenging the other to find the best hyper-plane giving him the most difficult points to classify. The game will then converge to a solution which will be a trade-off between the strategic abilities of the two players (eg. Different aspects of Game Theory are commonly used in Artificial Intelligence, I will now introduce you to the Nash Equilibrium, Inverse Game Theory and give you some practical examples.


Communication-Efficient Distributed Online Learning with Kernels

arXiv.org Machine Learning

We propose an efficient distributed online learning protocol for low-latency real-time services. It extends a previously presented protocol to kernelized online learners that represent their models by a support vector expansion. While such learners often achieve higher predictive performance than their linear counterparts, communicating the support vector expansions becomes inefficient for large numbers of support vectors. The proposed extension allows for a larger class of online learning algorithms---including those alleviating the problem above through model compression. In addition, we characterize the quality of the proposed protocol by introducing a novel criterion that requires the communication to be bounded by the loss suffered.


Distributed estimation of principal support vector machines for sufficient dimension reduction

arXiv.org Machine Learning

The principal support vector machines method (Li et al., 2011) is a powerful tool for sufficient dimension reduction that replaces original predictors with their low-dimensional linear combinations without loss of information. However, the computational burden of the principal support vector machines method constrains its use for massive data. To address this issue, we in this paper propose two distributed estimation algorithms for fast implementation when the sample size is large. Both the two distributed sufficient dimension reduction estimators enjoy the same statistical efficiency as merging all the data together, which provides rigorous statistical guarantees for their application to large scale datasets. The two distributed algorithms are further adapt to principal weighted support vector machines (Shin et al., 2016) for sufficient dimension reduction in binary classification. The statistical accuracy and computational complexity of our proposed methods are examined through comprehensive simulation studies and a real data application with more than 600000 samples.