Support Vector Machines


How to Develop a Face Recognition System Using FaceNet in Keras

#artificialintelligence

Face recognition is the computer vision task of identifying and verifying a person based on a photograph of their face. FaceNet is a face recognition system developed in 2015 by researchers at Google that achieved then state-of-the-art results on a range of face recognition benchmark datasets. The FaceNet system can be used broadly thanks to multiple third-party open source implementations of the model and the availability of pre-trained models. It can be used to extract high-quality features from faces, called face embeddings, that can then be used to train a face identification system. In this tutorial, you will discover how to develop a face recognition system using FaceNet and an SVM classifier to identify people from photographs.


Proof-of-concept system uses smart speakers to catch signs of cardiac arrest

#artificialintelligence

In an effort to tackle in-home cardiac arrest, University of Washington researchers have devised a novel contactless system that uses smartphones or voice-based personal assistants to identify telltale breathing patterns that accompany an attack. The proof-of-concept strategy, described in an NPJ Digital Medicine paper published this morning, involved a supervised machine learning model called a support-vector machine that was trained for use in the bedroom, a controlled environment in which the majority of in-home cardiac arrests occur. "Sometimes reported as 'gasping' breaths, agonal respirations may hold potential as an audible diagnostic biomarker, particularly in unwitnessed cardiac arrests that occur in a private residence, the location of [two-thirds] of all [out-of-hospital cardiac arrests]," the researchers wrote. "The widespread adoption of smartphones and smart speakers (projected to be in 75% of US households by 2020) presents a unique opportunity to identify this audible biomarker and connect unwitnessed cardiac arrest victims to emergency medical services (EMS) or others who can administer cardiopulmonary resuscitation." Cross-validation analysis of the trained classifier yielded an overall sensitivity and specificity of 97.24% and 99.51%, respectively.
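
As a reminder of what the two reported metrics measure, here is a minimal sketch that computes sensitivity (true positive rate) and specificity (true negative rate) from binary labels. The function name and the toy labels are made up for illustration; they are not the study's data or code.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for binary labels."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1  # positive case correctly flagged
            else:
                fn += 1  # positive case missed
        else:
            if pred == positive:
                fp += 1  # false alarm
            else:
                tn += 1  # negative case correctly ignored
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical labels: 1 = agonal breathing, 0 = any other sound.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2%} specificity={spec:.2%}")
```

For an alerting system of this kind, high specificity limits false alarms, while high sensitivity limits missed events.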


Machine Learning Classification with Python for Direct Marketing

#artificialintelligence

How can a business become more time-efficient, slash costs and drive up sales? The question is timeless but not rhetorical. In the next few minutes of your reading time, I will apply a few classification algorithms to demonstrate how a data analytic approach can contribute to that end. Together we'll create a predictive model that will help us customise the client databases we hand over to the telemarketing team so that they can concentrate resources on more promising clients first. Along the way, we'll perform a number of actions on the dataset.


Persistent homology detects curvature

arXiv.org Machine Learning

In topological data analysis, persistent homology is used to study the "shape of data". Persistent homology computations are completely characterized by a set of intervals called a bar code. It is often said that the long intervals represent the "topological signal" and the short intervals represent "noise". We give evidence to dispute this thesis, showing that the short intervals encode geometric information. Specifically, we prove that persistent homology detects the curvature of disks from which points have been sampled. We describe a general computational framework for solving inverse problems using the average persistence landscape, a continuous mapping from metric spaces with a probability measure to a Hilbert space. In the present application, the average persistence landscapes of points sampled from disks of constant curvature result in a path in this Hilbert space which may be learned using standard tools from statistical and machine learning.


k-Nearest Neighbor Optimization via Randomized Hyperstructure Convex Hull

arXiv.org Machine Learning

In the k-nearest neighbor algorithm (k-NN), the determination of classes for test instances is usually performed via a majority vote system, which may ignore the similarities among data. In this research, the researcher proposes an approach to fine-tune the selection of neighbors to be passed to the majority vote system through the construction of a random n-dimensional hyperstructure around the test instance by introducing a new threshold parameter. On the Haberman's Cancer Survival dataset, the proposed k-NN algorithm reaches an accuracy of 85.71%, compared to 80.95% for the conventional k-NN algorithm; on the Seeds dataset, it reaches 94.44%, compared to 88.89%. The proposed k-NN algorithm is also on par with the conventional support vector machine algorithm on the Banknote Authentication and Iris datasets, and surpasses the accuracy of the support vector machine on the Seeds dataset.
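
For context, the conventional majority-vote baseline that the abstract compares against can be sketched as follows. This is plain k-NN only, not the proposed hyperstructure method, whose details are not given here; the function name and the toy points are illustrative assumptions.

```python
from collections import Counter
import math


def knn_predict(train_points, train_labels, query, k=3):
    """Conventional k-NN: majority vote among the k nearest training points."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    # Every one of the k nearest neighbors gets one vote, however far away it is.
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Toy data: one "a" point very close to the query, two "b" points farther away.
train_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
train_labels = ["a", "b", "b"]
query = (0.1, 0.1)

nearest_only = knn_predict(train_points, train_labels, query, k=1)
majority = knn_predict(train_points, train_labels, query, k=3)
print(nearest_only, majority)
```

With k=3 the two distant "b" points outvote the much closer "a" point, which is the kind of ignored similarity the abstract refers to.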


Medium-Term Load Forecasting Using Support Vector Regression, Feature Selection, and Symbiotic Organism Search Optimization

arXiv.org Machine Learning

Accurate load forecasting has always been an indispensable part of the operation and planning of power systems. Among the different forecasting horizons, short-term load forecasting (STLF) and long-term load forecasting (LTLF) have benefited from accurate predictors and probabilistic forecasting, respectively, while medium-term load forecasting (MTLF) demands more attention because of its vital role in power system operation and planning, such as optimal scheduling of generation units, robust planning programs for customer service, and economic supply. In this study, a hybrid method composed of Support Vector Regression (SVR) and Symbiotic Organism Search Optimization (SOSO) is proposed for MTLF. In the proposed forecasting model, SVR is the main part of the forecasting algorithm, while SOSO is embedded into it to optimize the parameters of SVR. In addition, a minimum redundancy-maximum relevance feature selection algorithm is used in the preprocessing of input data. The proposed method is tested on the EUNITE competition dataset to demonstrate its performance. Furthermore, it is compared with some previous works to show the suitability of our method.


Automatically Evaluating Balance: A Machine Learning Approach

arXiv.org Machine Learning

Compared to in-clinic balance training, in-home training is not as effective. This is, in part, due to the lack of feedback from physical therapists (PTs). Here, we analyze the feasibility of using trunk sway data and machine learning (ML) techniques to automatically evaluate balance, providing accurate assessments outside of the clinic. We recruited sixteen participants to perform standing balance exercises. For each exercise, we recorded trunk sway data and had a PT rate balance performance on a scale of 1 to 5. The rating scale was adapted from the Functional Independence Measure. From the trunk sway data, we extracted a 61-dimensional feature vector representing performance of each exercise. Given these labeled data, we trained a multi-class support vector machine (SVM) to map trunk sway features to PT ratings. Evaluated in a leave-one-participant-out scheme, the model achieved a classification accuracy of 82%. Compared to participant self-assessment ratings, the SVM outputs were significantly closer to PT ratings. The results of this pilot study suggest that in the absence of PTs, ML techniques can provide accurate assessments during standing balance exercises. Such automated assessments could reduce PT consultation time and increase user compliance outside of the clinic.
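
The leave-one-participant-out scheme mentioned above can be sketched in a few lines: each participant's samples are held out in turn while the model is trained on everyone else, so no test fold shares a participant with its training data. The helper name and the participant IDs below are hypothetical, and the model fitting step is omitted.

```python
def leave_one_group_out(groups):
    """Yield (held_out_group, train_indices, test_indices) for each distinct group."""
    for held_out in sorted(set(groups)):
        train_idx = [i for i, g in enumerate(groups) if g != held_out]
        test_idx = [i for i, g in enumerate(groups) if g == held_out]
        yield held_out, train_idx, test_idx


# Hypothetical participant ID for each recorded exercise (one entry per sample).
participants = ["p1", "p1", "p2", "p2", "p2", "p3"]
splits = list(leave_one_group_out(participants))

for held_out, train_idx, test_idx in splits:
    # A classifier would be fit on train_idx and scored on test_idx here.
    print(held_out, train_idx, test_idx)
```

Splitting by participant rather than by sample avoids scoring the model on people whose other recordings it has already seen during training.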


Selecting Biomarkers for building optimal treatment selection rules using Kernel Machines

arXiv.org Machine Learning

Optimal biomarker combinations for treatment-selection can be derived by minimizing total burden to the population caused by the targeted disease and its treatment. However, when multiple biomarkers are present, including all in the model can be expensive and hurt model performance. To remedy this, we consider feature selection in optimization by minimizing an extended total burden that additionally incorporates biomarker measurement costs. Formulating it as a 0-norm penalized weighted classification, we develop various procedures for estimating linear and nonlinear combinations. Through simulations and a real data example, we demonstrate the importance of incorporating feature-selection and marker cost when deriving treatment-selection rules.


Enumeration of Distinct Support Vectors for Interactive Decision Making

arXiv.org Machine Learning

In conventional prediction tasks, a machine learning algorithm outputs a single best model that globally optimizes its objective function, which typically is accuracy. Therefore, users cannot access the other models explicitly. In contrast, multiple model enumeration attracts increasing interest in non-standard machine learning applications where criteria other than accuracy, e.g., interpretability or fairness, are the main concern and a user may want to access more than one non-optimal but suitable model. In this paper, we propose a K-best model enumeration algorithm for Support Vector Machines (SVM) that, given a dataset S and an integer K>0, enumerates the K-best models on S with distinct support vectors in descending order of the objective function values in the dual SVM problem. Based on an analysis of the lattice structure of support vectors, our algorithm efficiently finds the next best model with small latency. This is useful in supporting users' interactive examination of their requirements on enumerated models. Through experiments on real datasets, we evaluated the efficiency and usefulness of our algorithm.


Higher-Order Accelerated Methods for Faster Non-Smooth Optimization

arXiv.org Machine Learning

We provide improved convergence rates for various non-smooth optimization problems via higher-order accelerated methods. In the case of $\ell_\infty$ regression, we achieve an $O(\epsilon^{-4/5})$ iteration complexity, breaking the $O(\epsilon^{-1})$ barrier so far present for previous methods. We arrive at a similar rate for the problem of $\ell_1$-SVM, going beyond what is attainable by first-order methods with prox-oracle access for non-smooth non-strongly convex problems. We further show how to achieve even faster rates by introducing higher-order regularization. Our results rely on recent advances in near-optimal accelerated methods for higher-order smooth convex optimization. In particular, we extend Nesterov's smoothing technique to show that the standard softmax approximation is not only smooth in the usual sense, but also higher-order smooth. With this observation in hand, we provide the first example of higher-order acceleration techniques yielding faster rates for non-smooth optimization, to the best of our knowledge.
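
To make the smoothing idea concrete, here is a small sketch, assuming that "softmax approximation" refers to the usual log-sum-exp smoothing of the max function, $\mu \log \sum_i \exp(x_i/\mu)$. For $n$ values this smooth surrogate overestimates the true maximum by at most $\mu \log n$. The function name and the parameter name `mu` are illustrative, not taken from the paper.

```python
import math


def smooth_max(values, mu):
    """Log-sum-exp smoothing of max: mu * log(sum(exp(v / mu))).

    The largest value is subtracted before exponentiating so that
    no exponent is positive, which avoids overflow for small mu.
    """
    peak = max(values)
    total = sum(math.exp((v - peak) / mu) for v in values)
    return peak + mu * math.log(total)


values = [0.3, -1.2, 2.0, 1.9]
gaps = {}
for mu in (1.0, 0.1, 0.01):
    # Gap between the smooth surrogate and the true (non-smooth) maximum.
    gaps[mu] = smooth_max(values, mu) - max(values)
    print(f"mu={mu}: gap={gaps[mu]:.6f}, bound={mu * math.log(len(values)):.6f}")
```

A smaller `mu` gives a tighter approximation of the maximum but a less smooth function, which is the trade-off that smoothing-based methods have to balance.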