Accuracy
Counterfactual Explanation Algorithms for Behavioral and Textual Data
Ramon, Yanou, Martens, David, Provost, Foster, Evgeniou, Theodoros
We study the interpretability of predictive systems that use high-dimensonal behavioral and textual data. Examples include predicting product interest based on online browsing data and detecting spam emails or objectionable web content. Recently, counterfactual explanations have been proposed for generating insight into model predictions, which focus on what is relevant to a particular instance. Conducting a complete search to compute counterfactuals is very time-consuming because of the huge dimensionality. To our knowledge, for behavioral and text data, only one model-agnostic heuristic algorithm (SEDC) for finding counterfactual explanations has been proposed in the literature. However, there may be better algorithms for finding counterfactuals quickly. This study aligns the recently proposed Linear Interpretable Model-agnostic Explainer (LIME) and Shapley Additive Explanations (SHAP) with the notion of counterfactual explanations, and empirically benchmarks their effectiveness and efficiency against SEDC using a collection of 13 data sets. Results show that LIME-Counterfactual (LIME-C) and SHAP-Counterfactual (SHAP-C) have low and stable computation times, but mostly, they are less efficient than SEDC. However, for certain instances on certain data sets, SEDC's run time is comparably large. With regard to effectiveness, LIME-C and SHAP-C find reasonable, if not always optimal, counterfactual explanations. SHAP-C, however, seems to have difficulties with highly unbalanced data. Because of its good overall performance, LIME-C seems to be a favorable alternative to SEDC, which failed for some nonlinear models to find counterfactuals because of the particular heuristic search algorithm it uses. A main upshot of this paper is that there is a good deal of room for further research. For example, we propose algorithmic adjustments that are direct upshots of the paper's findings.
Building a machine learning classifier model for diabetes
The Pima Indians of Arizona and Mexico have the highest reported prevalence of diabetes of any population in the world. A small study has been conducted to analyse their medical records to assess if it is possible to predict the onset of diabetes based on diagnostic measures. The dataset is downloaded from Kaggle, where all patients included are females at least 21 years old of Pima Indian heritage. The objective of this project is to build a predictive machine learning model to predict based on diagnostic measurements whether a patient has diabetes. This is a binary (2-class) classification project with supervised learning. Jupyter Notebook (Python) could be used to follow the process below.
Music Style Classification with Compared Methods in XGB and BPNN
Tan, Lifeng, Jin, Cong, Cheng, Zhiyuan, Lv, Xin, Song, Leiyu
--Scientists have used many different classification methods to solve the problem of music classification. But the efficiency of each classification is different. In this paper, we propose two compared methods on the task of music style classification. More specifically, feature extraction for representing timbral texture, rhythmic content and pitch content are proposed. Comparative evaluations on performances of two classifiers were conducted for music classification with different composers' styles.
Asymptotic Normality and Variance Estimation For Supervised Ensembles
Zhou, Zhengze, Mentch, Lucas, Hooker, Giles
Ensemble methods based on bootstrapping have improved the predictive accuracy of base learners, but fail to provide a framework in which formal statistical inference can be conducted. Recent theoretical developments suggest taking subsamples without replacement and analyze the resulting estimator in the context of a U-statistic, thus demonstrating asymptotic normality properties. However, we observe that current methods for variance estimation exhibit severe bias when the number of base learners is not large enough, compromising the validity of the resulting confidence intervals or hypothesis tests. This paper shows that similar asymptotics can be achieved by means of V-statistics, corresponding to taking subsamples with replacement. Further, we develop a bias correction algorithm for estimating variance in the limiting distribution, which yields satisfactory results with moderate size of base learners.
A gray-box model for a probabilistic estimate of regional ground magnetic perturbations: Enhancing the NOAA operational Geospace model with machine learning
Camporeale, Enrico, Cash, M. D., Singer, H. J., Balch, C. C., Huang, Z., Toth, G.
We present a novel algorithm that predicts the probability that time derivative of the horizontal component of the ground magnetic field $dB/dt$ exceeds a specified threshold at a given location. This quantity provides important information that is physically relevant to Geomagnetically Induced Currents (GIC), which are electric currents induced by sudden changes of the Earth's magnetic field due to Space Weather events. The model follows a 'gray-box' approach by combining the output of a physics-based model with a machine learning approach. Specifically, we use the University of Michigan's Geospace model, that is operational at the NOAA Space Weather Prediction Center, with a boosted ensemble of classification trees. We discuss in detail the issue of combining a large dataset of ground-based measurements ($\sim$ 20 years) with a limited set of simulation runs ($\sim$ 2 years) by developing a surrogate model for the years in which simulation runs are not available. We also discuss the problem of re-calibrating the output of the decision tree to obtain reliable probabilities. The performance of the model is assessed by typical metrics for probabilistic forecasts: Probability of Detection and False Detection, True Skill Score, Heidke Skill Score, and Receiver Operating Characteristic curve.
AP-Perf: Incorporating Generic Performance Metrics in Differentiable Learning
Fathony, Rizal, Kolter, J. Zico
We propose a method that enables practitioners to conveniently incorporate custom non-decomposable performance metrics into differentiable learning pipelines, notably those based upon deep learning architectures. Our approach is based on the recently-developed adversarial prediction framework, a distributionally robust approach that optimizes a metric in the worst case given the statistical summary of the empirical distribution. We formulate a marginal distribution technique to reduce the complexity of optimizing the adversarial prediction formulation over a vast range of non-decomposable metrics. We demonstrate how easy it is to write and incorporate complex custom metrics using our provided tool. Finally, we show the effectiveness of our approach for image classification tasks using MNIST and Fashion-MNIST datasets as well as classification task on tabular data using UCI repository and benchmark datasets.
Matrix sketching for supervised classification with imbalanced classes
Falcone, Roberta, Montanari, Angela, Anderlucci, Laura
Matrix sketching is a recently developed data compression technique. An input matrix A is efficiently approximated with a smaller matrix B, so that B preserves most of the properties of A up to some guaranteed approximation ratio. In so doing numerical operations on big data sets become faster. Sketching algorithms generally use random projections to compress the original dataset and this stochastic generation process makes them amenable to statistical analysis. The statistical properties of sketching algorithms have been widely studied in the context of multiple linear regression. In this paper we propose matrix sketching as a tool for rebalancing class sizes in supervised classification with imbalanced classes. It is well-known in fact that class imbalance may lead to poor classification performances especially as far as the minority class is concerned.
On the Delta Method for Uncertainty Approximation in Deep Learning
Nilsen, Geir K., Munthe-Kaas, Antonella Z., Skaug, Hans J., Brun, Morten
The Delta method is a well known procedure used to quantify uncertainty in statistical models. The method has previously been applied in the context of neural networks, but has not reached much popularity in deep learning because of the sheer size of the Hessian matrix. In this paper, we propose a low cost variant of the method based on an approximate eigendecomposition of the positive curvature subspace of the Hessian matrix. The method has a computational complexity of $O(KPN)$ time and $O(KP)$ space, where $K$ is the number of utilized Hessian eigenpairs, $P$ is the number of model parameters and $N$ is the number of training examples. Given that the model is $L_2$-regularized with rate $\lambda/2$, we provide a bound on the uncertainty approximation error given $K$. We show that when the smallest Hessian eigenvalue in the positive $K/2$-tail of the full spectrum, and the largest Hessian eigenvalue in the negative $K/2$-tail of the full spectrum are both approximately equal to $\lambda$, the error will be close to zero even when $K\ll P$ . We demonstrate the method by a TensorFlow implementation, and show that meaningful rankings of images based on prediction uncertainty can be obtained for a convolutional neural network based MNIST classifier. We also observe that false positives have higher prediction uncertainty than true positives. This suggests that there is supplementing information in the uncertainty measure not captured by the probability alone.
Recovering from Biased Data: Can Fairness Constraints Improve Accuracy?
Multiple fairness constraints have been proposed in the literature, motivated by a range of concerns about how demographic groups might be treated unfairly by machine learning classifiers. In this work we consider a different motivation; learning from biased training data. We posit several ways in which training data may be biased, including having a more noisy or negatively biased labeling process on members of a disadvantaged group, or a decreased prevalence of positive or negative examples from the disadvantaged group, or both. Given such biased training data, Empirical Risk Minimization (ERM) may produce a classifier that not only is biased but also has suboptimal accuracy on the true data distribution. We examine the ability of fairness-constrained ERM to correct this problem. In particular, we find that the Equal Opportunity fairness constraint (Hardt, Price, and Srebro 2016) combined with ERM will provably recover the Bayes Optimal Classifier under a range of bias models. We also consider other recovery methods including reweighting the training data, Equalized Odds, and Demographic Parity. These theoretical results provide additional motivation for considering fairness interventions even if an actor cares primarily about accuracy.
Cybersecurity in 2020: More targeted attacks, AI not a prevention panacea
Given the proliferation of high-profile attacks in 2019, the security outlook for next year--and the next decade--is filled with potential pitfalls, as challenges persist in maintaining the security profile in enterprises, particularly as security operations teams are spread thinner as attack surfaces widen. SEE: Special report: The cloud v. data center decision (free PDF) (TechRepublic) McAfee CTO Steve Grobman and Director of Engineering Liz Maida--who joined the company through their acquisition of Uplevel Security, a firm that applied graph theory and machine learning to security data--spoke to TechRepublic about the security forecast for 2020. In contrast to spray-and-pray attacks, relying on port scanning to uncover low-hanging vulnerabilities, an increase in attacks targeting specific industries are anticipated to continue their rise in popularity. "We've seen a good number of ransomware campaigns where the adversaries have done reconnaissance to really understand the critical assets [and] the defenses, and then tailor the attack in order to get into that environment, to demand a higher payment from the victim," Grobman said. "That really requires a much more sophisticated level of defense for the defenders. The other point that I'd make is...we see the evolution of attacks from just focusing on traditional compute environments, to also focusing on cloud environments. Given that many organizations are shifting key components of their operations into the cloud, it would be natural that adversaries are looking for ways to not only target traditional environments, but also cloud assets," Grobman said.