 Performance Analysis


Adversarial Deep Ensemble: Evasion Attacks and Defenses for Malware Detection

arXiv.org Machine Learning

Malware remains a major threat to cyber security, calling for machine learning based malware detection. While promising, such detectors are known to be vulnerable to evasion attacks. Ensemble learning typically facilitates countermeasures, but attackers can leverage the same technique to improve attack effectiveness. This motivates us to investigate what robustness an ensemble defense can achieve, and what effectiveness an ensemble attack can achieve, particularly when the two are pitted against each other. We thus propose a new attack approach, named mixture of attacks, which equips attackers with multiple generative methods and multiple manipulation sets to perturb a malware example without ruining its malicious functionality. This naturally leads to a new instantiation of adversarial training, which is further geared to enhancing the ensemble of deep neural networks. We evaluate defenses using Android malware detectors against 26 different attacks upon two practical datasets. Experimental results show that the new adversarial training significantly enhances the robustness of deep neural networks against a wide range of attacks, that ensemble methods promote robustness when the base classifiers are robust enough, and yet that ensemble attacks can evade the enhanced malware detectors effectively, even notably downgrading the VirusTotal service.


Deep Probabilistic Accelerated Evaluation: A Certifiable Rare-Event Simulation Methodology for Black-Box Autonomy

arXiv.org Machine Learning

Evaluating the reliability of intelligent physical systems against rare catastrophic events poses a huge testing burden for real-world applications. Simulation provides a useful, if not unique, platform to evaluate the extremal risks of these AI-enabled systems before their deployment. Importance Sampling (IS), while proven to be powerful for rare-event simulation, faces challenges in handling these systems due to their black-box nature, which fundamentally undermines its efficiency guarantee. To overcome this challenge, we propose a framework called Deep Probabilistic Accelerated Evaluation (D-PrAE) to design IS, which leverages rare-event-set learning and a new notion of efficiency certificate. D-PrAE combines the dominating point method with deep neural network classifiers to achieve superior estimation efficiency. We present theoretical guarantees and demonstrate the empirical effectiveness of D-PrAE via examples on the safety-testing of self-driving algorithms that are beyond the reach of classical variance reduction techniques.
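
As a rough illustration of why importance sampling helps with rare events, the sketch below estimates a Gaussian tail probability by sampling from a proposal shifted onto the rare set and reweighting each sample by the likelihood ratio. This is a one-dimensional toy example, not the D-PrAE method: the function names, the standard-normal target, and the choice of centering the proposal at the threshold (the most likely point of the rare set in this toy case) are all assumptions made for illustration.

```python
import math
import random


def rare_event_probability(threshold, n_samples, seed=0):
    """Estimate P(X > threshold) for X ~ N(0, 1) by importance sampling.

    Assumption for this sketch: the proposal is N(threshold, 1).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x = rng.gauss(threshold, 1.0)  # draw from the shifted proposal
        if x > threshold:
            # Likelihood ratio N(0,1) / N(threshold,1) evaluated at x.
            total += math.exp(-threshold * x + 0.5 * threshold ** 2)
    return total / n_samples


def exact_tail(threshold):
    """Closed-form P(X > threshold) for a standard normal, for comparison."""
    return 0.5 * math.erfc(threshold / math.sqrt(2.0))


if __name__ == "__main__":
    print(rare_event_probability(4.0, 20000), exact_tail(4.0))
```

Naive Monte Carlo with the same budget would see almost no samples beyond the threshold, whereas roughly half of the proposal samples land in the rare set here.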


Kendall transformation

arXiv.org Machine Learning

Kendall transformation is a conversion of an ordered feature into a vector of pairwise order relations between individual values. This way, it preserves the ranking of observations and represents it in a categorical form. Such a transformation allows for the generalisation of methods requiring strictly categorical input, especially in the limit of a small number of observations, when discretisation becomes problematic. In particular, many approaches of information theory can be directly applied to Kendall-transformed continuous data without relying on differential entropy or any additional parameters. Moreover, by filtering information to that contained in the ranking, Kendall transformation leads to better robustness at a reasonable cost of dropping sophisticated interactions, which are anyhow unlikely to be correctly estimated. In bivariate analysis, Kendall transformation can be related to popular non-parametric methods, showing the soundness of the approach. The paper also demonstrates its efficiency in multivariate problems and provides an example analysis of real-world data.
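
The idea of turning an ordered feature into pairwise order relations can be sketched as follows. This is a minimal illustration based only on the description above: the function name, the use of unordered pairs (i < j), and the three symbols used to encode the relations are assumptions, and the paper's exact pair enumeration and encoding may differ.

```python
from itertools import combinations


def kendall_transform(values):
    """Map an ordered feature to a list of categorical pairwise relations.

    Each pair (values[i], values[j]) with i < j becomes "<", ">" or "=".
    """
    relations = []
    for a, b in combinations(values, 2):
        if a < b:
            relations.append("<")
        elif a > b:
            relations.append(">")
        else:
            relations.append("=")
    return relations


if __name__ == "__main__":
    print(kendall_transform([1, 3, 2]))
```

Because only the order of values matters, any strictly increasing rescaling of the feature yields the same output, which is the sense in which the ranking is preserved.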


Deep Doubly Supervised Transfer Network for Diagnosis of Breast Cancer with Imbalanced Ultrasound Imaging Modalities

arXiv.org Machine Learning

Elastography ultrasound (EUS) provides additional bio-mechanical information about lesions for B-mode ultrasound (BUS) in the diagnosis of breast cancers. However, joint utilization of both BUS and EUS is not popular due to the lack of EUS devices in rural hospitals, which gives rise to a novel modality imbalance problem in computer-aided diagnosis (CAD) for breast cancers. Current transfer learning (TL) methods pay little attention to this special issue of clinical modality imbalance, that is, the source domain (EUS modality) has fewer labeled samples than the target domain (BUS modality). Moreover, these TL methods cannot fully use the label information to explore the intrinsic relation between the two modalities and then guide the promoted knowledge transfer. To this end, we propose a novel doubly supervised TL network (DDSTN) that integrates the Learning Using Privileged Information (LUPI) paradigm and the Maximum Mean Discrepancy (MMD) criterion into a unified deep TL framework. The proposed algorithm can not only make full use of the shared labels to effectively guide knowledge transfer through the LUPI paradigm, but also perform additional supervised transfer between unpaired data. We further introduce the MMD criterion to enhance the knowledge transfer. The experimental results on the breast ultrasound dataset indicate that the proposed DDSTN outperforms all the compared state-of-the-art algorithms for BUS-based CAD. Keywords: Ultrasound imaging, Breast cancer, Deep doubly supervised transfer learning, Support vector machine plus, Maximum mean discrepancy.
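
For readers unfamiliar with the MMD criterion, the sketch below computes the standard biased estimate of the squared Maximum Mean Discrepancy between two small sets of feature vectors. It is a generic illustration, not the DDSTN loss: the Gaussian kernel, the fixed bandwidth, and all function names are assumptions, and the abstract does not say which kernel or estimator the authors use.

```python
import math


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mean_kernel(xs, ys, bandwidth):
    """Average kernel value over all pairs drawn from xs and ys."""
    total = sum(gaussian_kernel(x, y, bandwidth) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(source, target, bandwidth=1.0):
    """Biased estimate of squared MMD between two samples."""
    return (
        mean_kernel(source, source, bandwidth)
        + mean_kernel(target, target, bandwidth)
        - 2.0 * mean_kernel(source, target, bandwidth)
    )


if __name__ == "__main__":
    a = [[0.0, 0.0], [1.0, 1.0]]
    b = [[5.0, 5.0], [6.0, 6.0]]
    print(mmd_squared(a, a), mmd_squared(a, b))
```

A value near zero indicates that the two samples look alike under the chosen kernel, so minimizing such a term pulls the source and target feature distributions together.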


Random Partitioning Forest for Point-Wise and Collective Anomaly Detection -- Application to Intrusion Detection

arXiv.org Machine Learning

In this paper, we propose DiFF-RF, an ensemble approach composed of random partitioning binary trees to detect point-wise and collective (as well as contextual) anomalies. Thanks to a distance-based paradigm used at the leaves of the trees, this semi-supervised approach solves a drawback that has been identified in the isolation forest (IF) algorithm. Moreover, taking into account the frequencies of visits in the leaves of the random trees significantly improves the performance of DiFF-RF in the presence of collective anomalies. DiFF-RF is fairly easy to train, and excellent performance can be obtained by using a simple semi-supervised procedure to set up the extra hyper-parameter that is introduced. We first evaluate DiFF-RF on a synthetic data set to i) verify that the limitation of the IF algorithm is overcome, ii) demonstrate how collective anomalies are actually detected, and iii) analyze the effect of the meta-parameters it involves. We then assess the DiFF-RF algorithm on a large set of datasets from the UCI repository, as well as on two benchmarks related to intrusion detection applications. Our experiments show that DiFF-RF almost systematically outperforms the IF algorithm, and also challenges the one-class SVM baseline and a deep learning variational auto-encoder architecture. Furthermore, our experience shows that DiFF-RF can work well in the presence of small-scale learning data, which is, by contrast, difficult for deep neural architectures. Finally, DiFF-RF is computationally efficient and can be easily parallelized on multi-core architectures.


Harnessing Adversarial Distances to Discover High-Confidence Errors

arXiv.org Machine Learning

Given a deep neural network image classification model that we treat as a black box, and an unlabeled evaluation dataset, we develop an efficient strategy by which the classifier can be evaluated. Randomly sampling and labeling instances from an unlabeled evaluation dataset allows traditional performance measures like accuracy, precision, and recall to be estimated. However, random sampling may miss rare errors for which the model is highly confident in its prediction, but wrong. These high-confidence errors can represent costly mistakes, and therefore should be explicitly searched for. Past works have developed search techniques to find classification errors above a specified confidence threshold, but ignore the fact that errors should be expected at confidence levels anywhere below 100%. In this work, we investigate the problem of finding errors at rates greater than expected given model confidence. Additionally, we propose a query-efficient and novel search technique that is guided by adversarial perturbations to find these mistakes in black box models. Through rigorous empirical experimentation, we demonstrate that our Adversarial Distance search discovers high-confidence errors at a rate greater than expected given model confidence.
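
One way to read "errors at rates greater than expected given model confidence" is sketched below. It assumes, purely for illustration, that a prediction made with confidence c is expected to be wrong with probability 1 - c (that is, a perfectly calibrated model), so the expected number of errors in a labeled sample is the sum of 1 - c over its items. The function name and this calibration-based baseline are assumptions and may not match the paper's exact definition.

```python
def error_counts(confidences, is_error):
    """Return (observed, expected) error counts for a labeled sample.

    confidences: model confidence for each predicted label, in [0, 1].
    is_error:    True where the prediction turned out to be wrong.
    Assumption: expected errors follow from perfect calibration.
    """
    if len(confidences) != len(is_error):
        raise ValueError("confidences and is_error must have equal length")
    expected = sum(1.0 - c for c in confidences)
    observed = sum(1 for wrong in is_error if wrong)
    return observed, expected


if __name__ == "__main__":
    observed, expected = error_counts(
        [0.9, 0.9, 0.99, 0.95], [True, False, False, True]
    )
    print(observed, expected)
```

A search strategy would be interesting under this reading when the observed count among the items it selects clearly exceeds the expected count.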


Data Science questions for interview prep (Machine Learning Concepts) -Part I

#artificialintelligence

I recently finished watching this Machine Learning playlist (StatQuest by Josh Starmer) on YouTube and thought of summarizing each concept into a Q/A. As I prepare for more data science interviews, I thought it would be a good exercise to make sure that I am communicating my thoughts clearly and concisely during an interview. Let me know in the comments if I am not doing a good job of explaining any of the concepts. NOTE: This article is not aimed at teaching a concept to beginners. It assumes that the reader has sufficient background in data science concepts.


Causal Explanations of Image Misclassifications

arXiv.org Machine Learning

The causal explanation of image misclassifications is an understudied niche, which can potentially provide valuable insights into model interpretability and increase prediction accuracy. This study trains six modern CNN architectures on CIFAR-10, including VGG16, ResNet50, GoogLeNet, DenseNet161, MobileNet V2, and Inception V3, and explores the misclassification patterns using conditional confusion matrices and misclassification networks. Two causes are identified and qualitatively distinguished: morphological similarity and non-essential information interference. The former cause is not model dependent, whereas the latter is inconsistent across all six models. To reduce the misclassifications caused by non-essential information interference, this study erases the pixels within the bounding boxes anchored at the top 5% pixels of the saliency map. This method first verifies the cause; then, by directly modifying the cause, it reduces the misclassification. Future studies will focus on quantitatively differentiating the two causes of misclassifications, generalizing the anchor-box based inference modification method to reduce misclassification, and exploring the interactions of the two causes in misclassifications.
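
The erasing step described above can be sketched roughly as follows for a single-channel image stored as nested lists. The abstract does not give the box size, the fill value, or how boxes are positioned relative to their anchors, so the square box centered on each anchor, the `half_size` and `fill` parameters, and the function name are all assumptions made for illustration.

```python
def erase_salient_boxes(image, saliency, top_fraction=0.05, half_size=1, fill=0):
    """Erase square boxes centered on the most salient pixels.

    image, saliency: 2D lists of equal shape.
    top_fraction:    share of pixels used as anchors (5% as described above).
    half_size, fill: hypothetical box half-width and replacement value.
    """
    height, width = len(image), len(image[0])
    # Rank all pixel positions by saliency, highest first.
    ranked = sorted(
        ((saliency[r][c], r, c) for r in range(height) for c in range(width)),
        reverse=True,
    )
    n_anchors = max(1, int(top_fraction * height * width))
    erased = [row[:] for row in image]  # leave the input image untouched
    for _, r, c in ranked[:n_anchors]:
        for rr in range(max(0, r - half_size), min(height, r + half_size + 1)):
            for cc in range(max(0, c - half_size), min(width, c + half_size + 1)):
                erased[rr][cc] = fill
    return erased


if __name__ == "__main__":
    img = [[1] * 5 for _ in range(5)]
    sal = [[0.0] * 5 for _ in range(5)]
    sal[2][2] = 1.0
    for row in erase_salient_boxes(img, sal):
        print(row)
```

Re-running the classifier on the erased image and checking whether the prediction changes is then a way to test whether the salient region was driving the misclassification.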


Abolish the #TechToPrisonPipeline

#artificialintelligence

The authors of the Harrisburg University study make explicit their desire to provide "a significant advantage for law enforcement agencies and other intelligence agencies to prevent crime" as a co-author and former NYPD police officer outlined in the original press release.[38] At a time when the legitimacy of the carceral state, and policing in particular, is being challenged on fundamental grounds in the United States, there is high demand in law enforcement for research of this nature, research which erases historical violence and manufactures fear through the so-called prediction of criminality. Publishers and funding agencies serve a crucial role in feeding this ravenous maw by providing platforms and incentives for such research. The circulation of this work by a major publisher like Springer would represent a significant step towards the legitimation and application of repeatedly debunked, socially harmful research in the real world. To reiterate our demands, the review committee must publicly rescind the offer for publication of this specific study, along with an explanation of the criteria used to evaluate it. Springer must issue a statement condemning the use of criminal justice statistics to predict criminality and acknowledging their role in incentivizing such harmful scholarship in the past. Finally, all publishers must refrain from publishing similar studies in the future.


A Unified Framework for Analyzing and Detecting Malicious Examples of DNN Models

arXiv.org Machine Learning

Deep Neural Networks are well known to be vulnerable to adversarial attacks and backdoor attacks, where minor modifications to the input can mislead the models into giving wrong results. Although defenses against adversarial attacks have been widely studied, research on mitigating backdoor attacks is still at an early stage. It is unknown whether there are any connections and common characteristics between the defenses against these two attacks. In this paper, we present a unified framework for detecting malicious examples and protecting the inference results of Deep Learning models. This framework is based on our observation that both adversarial examples and backdoor examples exhibit anomalies during the inference process that are highly distinguishable from benign samples. As a result, we repurpose and revise four existing adversarial defense methods for detecting backdoor examples. Extensive evaluations indicate that these approaches provide reliable protection against backdoor attacks, with a higher accuracy than when detecting adversarial examples. These solutions also reveal the relations among adversarial examples, backdoor examples and normal samples in model sensitivity, activation space and feature space. This can enhance our understanding of the inherent features of these two attacks, as well as of the defense opportunities.