Goto

Collaborating Authors

 Support Vector Machines


GLocalX -- From Local to Global Explanations of Black Box AI Models

arXiv.org Artificial Intelligence

Artificial Intelligence (AI) has come to prominence as one of the major components of our society, with applications in most aspects of our lives. In this field, complex and highly nonlinear machine learning models such as ensemble models, deep neural networks, and Support Vector Machines have consistently shown remarkable accuracy in solving complex tasks. Although accurate, AI models often are "black boxes" which we are not able to understand. Relying on these models has a multifaceted impact and raises significant concerns about their transparency. Applications in sensitive and critical domains are a strong motivational factor in trying to understand the behavior of black boxes. We propose to address this issue by providing an interpretable layer on top of black box models by aggregating "local" explanations. We present GLocalX, a "local-first" model agnostic explanation method. Starting from local explanations expressed in form of local decision rules, GLocalX iteratively generalizes them into global explanations by hierarchically aggregating them. Our goal is to learn accurate yet simple interpretable models to emulate the given black box, and, if possible, replace it entirely. We validate GLocalX in a set of experiments in standard and constrained settings with limited or no access to either data or local explanations. Experiments show that GLocalX is able to accurately emulate several models with simple and small models, reaching state-of-the-art performance against natively global solutions. Our findings show how it is often possible to achieve a high level of both accuracy and comprehensibility of classification models, even in complex domains with high-dimensional data, without necessarily trading one property for the other. This is a key requirement for a trustworthy AI, necessary for adoption in high-stakes decision making applications.


Utilizing Import Vector Machines to Identify Dangerous Pro-active Traffic Conditions

arXiv.org Machine Learning

Traffic accidents have been a severe issue in metropolises with the development of traffic flow. This paper explores the theory and application of a recently developed machine learning technique, namely Import Vector Machines (IVMs), in real-time crash risk analysis, which is a hot topic to reduce traffic accidents. Historical crash data and corresponding traffic data from Shanghai Urban Expressway System were employed and matched. Traffic conditions are labelled as dangerous (i.e. probably leading to a crash) and safe (i.e. a normal traffic condition) based on 5-minute measurements of average speed, volume and occupancy. The IVM algorithm is trained to build the classifier and its performance is compared to the popular and successfully applied technique of Support Vector Machines (SVMs). The main findings indicate that IVMs could successfully be employed in real-time identification of dangerous pro-active traffic conditions. Furthermore, similar to the "support points" of the SVM, the IVM model uses only a fraction of the training data to index kernel basis functions, typically a much smaller fraction than the SVM, and its classification rates are similar to those of SVMs. This gives the IVM a computational advantage over the SVM, especially when the size of the training data set is large.


Machine Learning Towards Intelligent Systems: Applications, Challenges, and Opportunities

arXiv.org Artificial Intelligence

The emergence and continued reliance on the Internet and related technologies has resulted in the generation of large amounts of data that can be made available for analyses. However, humans do not possess the cognitive capabilities to understand such large amounts of data. Machine learning (ML) provides a mechanism for humans to process large amounts of data, gain insights about the behavior of the data, and make more informed decision based on the resulting analysis. ML has applications in various fields. This review focuses on some of the fields and applications such as education, healthcare, network security, banking and finance, and social media. Within these fields, there are multiple unique challenges that exist. However, ML can provide solutions to these challenges, as well as create further research opportunities. Accordingly, this work surveys some of the challenges facing the aforementioned fields and presents some of the previous literature works that tackled them. Moreover, it suggests several research opportunities that benefit from the use of ML to address these challenges.


Unobtrusive Pain Monitoring in Older Adults with Dementia using Pairwise and Contrastive Training

arXiv.org Artificial Intelligence

Although pain is frequent in old age, older adults are often undertreated for pain. This is especially the case for longterm care residents with moderate to severe dementia who cannot report their pain because of cognitive impairments that accompany dementia. Nursing staff acknowledge the challenges of effectively recognizing and managing pain in long-term care facilities due to lack of human resources and, sometimes, expertise to use validated pain assessment approaches on a regular basis. Vision-based ambient monitoring will allow for frequent automated assessments so care staff could be automatically notified when signs of pain are displayed. However, existing computer vision techniques for pain detection are not validated on faces of older adults or people with dementia, and this population is not represented in existing facial expression datasets of pain. We present the first fully automated vision-based technique validated on a dementia cohort. First, we develop a deep learning-based computer vision system for detecting painful facial expressions on a video dataset that is collected unobtrusively from older adult participants with and without dementia. Second, we introduce a pairwise comparative inference method that calibrates to each person and is sensitive to changes in facial expression while using training data more efficiently than sequence models. Third, we introduce a fast contrastive training method that improves cross-dataset performance. Our pain estimation model outperforms baselines by a wide margin, especially when evaluated on faces of people with dementia. A. Motivation Pain is common and frequent in old age [1], but older adults are often underdiagnosed and undertreated for pain [1], [2]. This problem is especially serious for people with dementia who are often unable to verbally express or otherwise communicate their experience due to cognitive impairment [3]. Effective and validated assessment approaches, based on observation of nonverbal pain behaviours - e.g. Untreated pain can have serious physical (e.g. The motivation for this work is to develop an ambient monitoring technology to reliably, automatically, and consistently assess pain in order to improve pain management in LTC and to ultimately provide a better quality of care to older adults. B. Clinically Valid Assessments of Pain in Dementia Two clinically validated metrics for assessing pain in dementia exist: 1) the Prkachin and Solomon Pain Index (PSPI) [6] and 2) the Pain Assessment Checklist for Seniors with Limited Ability to Communicate-II (PACSLAC-II) [7]. PSPI is a 16-point metric, and is based on the Facial Action Coding System (FACS) [8].


SHARKS: Smart Hacking Approaches for RisK Scanning in Internet-of-Things and Cyber-Physical Systems based on Machine Learning

arXiv.org Artificial Intelligence

Cyber-physical systems (CPS) and Internet-of-Things (IoT) devices are increasingly being deployed across multiple functionalities, ranging from healthcare devices and wearables to critical infrastructures, e.g., nuclear power plants, autonomous vehicles, smart cities, and smart homes. These devices are inherently not secure across their comprehensive software, hardware, and network stacks, thus presenting a large attack surface that can be exploited by hackers. In this article, we present an innovative technique for detecting unknown system vulnerabilities, managing these vulnerabilities, and improving incident response when such vulnerabilities are exploited. The novelty of this approach lies in extracting intelligence from known real-world CPS/IoT attacks, representing them in the form of regular expressions, and employing machine learning (ML) techniques on this ensemble of regular expressions to generate new attack vectors and security vulnerabilities. Our results show that 10 new attack vectors and 122 new vulnerability exploits can be successfully generated that have the potential to exploit a CPS or an IoT ecosystem. The ML methodology achieves an accuracy of 97.4% and enables us to predict these attacks efficiently with an 87.2% reduction in the search space. We demonstrate the application of our method to the hacking of the in-vehicle network of a connected car. To defend against the known attacks and possible novel exploits, we discuss a defense-in-depth mechanism for various classes of attacks and the classification of data targeted by such attacks. This defense mechanism optimizes the cost of security measures based on the sensitivity of the protected resource, thus incentivizing its adoption in real-world CPS/IoT by cybersecurity practitioners.


Learning Sign-Constrained Support Vector Machines

arXiv.org Artificial Intelligence

Domain knowledge is useful to improve the generalization performance of learning machines. Sign constraints are a handy representation to combine domain knowledge with learning machine. In this paper, we consider constraining the signs of the weight coefficients in learning the linear support vector machine, and develop two optimization algorithms for minimizing the empirical risk under the sign constraints. One of the two algorithms is based on the projected gradient method, in which each iteration of the projected gradient method takes $O(nd)$ computational cost and the sublinear convergence of the objective error is guaranteed. The second algorithm is based on the Frank-Wolfe method that also converges sublinearly and possesses a clear termination criterion. We show that each iteration of the Frank-Wolfe also requires $O(nd)$ cost. Furthermore, we derive the explicit expression for the minimal iteration number to ensure an $\epsilon$-accurate solution by analyzing the curvature of the objective function. Finally, we empirically demonstrate that the sign constraints are a promising technique when similarities to the training examples compose the feature vector.


Support Vector Machine and YOLO for a Mobile Food Grading System

arXiv.org Artificial Intelligence

Food quality and safety are of great concern to society since it is an essential guarantee not only for human health but also for social development, and stability. Ensuring food quality and safety is a complex process. All food processing stages should be considered, from cultivating, harvesting and storage to preparation and consumption. Grading is one of the essential processes to control food quality. This paper proposed a mobile visual-based system to evaluate food grading. Specifically, the proposed system acquires images of bananas when they are on moving conveyors. A two-layer image processing system based on machine learning is used to grade bananas, and these two layers are allocated on edge devices and cloud servers, respectively. Support Vector Machine (SVM) is the first layer to classify bananas based on an extracted feature vector composed of color and texture features. Then, the a You Only Look Once (YOLO) v3 model further locating the peel's defected area and determining if the inputs belong to the mid-ripened or well-ripened class. According to experimental results, the first layer's performance achieved an accuracy of 98.5% while the accuracy of the second layer is 85.7%, and the overall accuracy is 96.4%.


Provably Training Neural Network Classifiers under Fairness Constraints

arXiv.org Machine Learning

Training a classifier under fairness constraints has gotten increasing attention in the machine learning community thanks to moral, legal, and business reasons. However, several recent works addressing algorithmic fairness have only focused on simple models such as logistic regression or support vector machines due to non-convex and non-differentiable fairness criteria across protected groups, such as race or gender. Neural networks, the most widely used models for classification nowadays, are precluded and lack theoretical guarantees. This paper aims to fill this missing but crucial part of the literature of algorithmic fairness for neural networks. In particular, we show that overparametrized neural networks could meet the fairness constraints. The key ingredient of building a fair neural network classifier is establishing no-regret analysis for neural networks in the overparameterization regime, which may be of independent interest in the online learning of neural networks and related applications.


Equipment Failure Analysis for Oil and Gas Industry with an Ensemble Predictive Model

arXiv.org Artificial Intelligence

This paper aims at improving the classification accuracy of a Support Vector Machine (SVM) classifier with Sequential Minimal Optimization (SMO) training algorithm in order to properly classify failure and normal instances from oil and gas equipment data. Recent applications of failure analysis have made use of the SVM technique without implementing SMO training algorithm, while in our study we show that the proposed solution can perform much better when using the SMO training algorithm. Furthermore, we implement the ensemble approach, which is a hybrid rule based and neural network classifier to improve the performance of the SVM classifier (with SMO training algorithm). The optimization study is as a result of the underperformance of the classifier when dealing with imbalanced dataset. The selected best performing classifiers are combined together with SVM classifier (with SMO training algorithm) by using the stacking ensemble method which is to create an efficient ensemble predictive model that can handle the issue of imbalanced data. The classification performance of this predictive model is considerably better than the SVM with and without SMO training algorithm and many other conventional classifiers.


Predicting seasonal influenza using supermarket retail records

arXiv.org Artificial Intelligence

Increased availability of epidemiological data, novel digital data streams, and the rise of powerful machine learning approaches have generated a surge of research activity on real-time epidemic forecast systems. In this paper, we propose the use of a novel data source, namely retail market data to improve seasonal influenza forecasting. Specifically, we consider supermarket retail data as a proxy signal for influenza, through the identification of sentinel baskets, i.e., products bought together by a population of selected customers. We develop a nowcasting and forecasting framework that provides estimates for influenza incidence in Italy up to 4 weeks ahead. We make use of the Support Vector Regression (SVR) model to produce the predictions of seasonal flu incidence. Our predictions outperform both a baseline autoregressive model and a second baseline based on product purchases. The results show quantitatively the value of incorporating retail market data in forecasting models, acting as a proxy that can be used for the real-time analysis of epidemics.