Hossain, Khondoker Murad
DeBUGCN -- Detecting Backdoors in CNNs Using Graph Convolutional Networks
Vartak, Akash, Hossain, Khondoker Murad, Oates, Tim
Deep neural networks (DNNs) are becoming commonplace in critical applications, making their susceptibility to backdoor (trojan) attacks a significant problem. In this paper, we introduce a novel backdoor attack detection pipeline, detecting attacked models using graph convolution networks (DeBUGCN). To the best of our knowledge, ours is the first use of GCNs for trojan detection. We use the static weights of a DNN to create a graph structure of its layers. A GCN is then used as a binary classifier on these graphs, yielding a trojan or clean determination for the DNN. To demonstrate the efficacy of our pipeline, we train hundreds of clean and trojaned CNN models on the MNIST handwritten digits and CIFAR-10 image datasets, and show the DNN classification results using DeBUGCN. For a true In-the-Wild use case, our pipeline is evaluated on the TrojAI dataset which consists of various CNN architectures, thus showing the robustness and model-agnostic behaviour of DeBUGCN. Furthermore, on comparing our results on several datasets with state-of-the-art trojan detection algorithms, DeBUGCN is faster and more accurate.
TEN-GUARD: Tensor Decomposition for Backdoor Attack Detection in Deep Neural Networks
Hossain, Khondoker Murad, Oates, Tim
As deep neural networks and the datasets used to train them get larger, the default approach to integrating them into research and commercial projects is to download a pre-trained model and fine tune it. But these models can have uncertain provenance, opening up the possibility that they embed hidden malicious behavior such as trojans or backdoors, where small changes to an input (triggers) can cause the model to produce incorrect outputs (e.g., to misclassify). This paper introduces a novel approach to backdoor detection that uses two tensor decomposition methods applied to network activations. This has a number of advantages relative to existing detection methods, including the ability to analyze multiple models at the same time, working across a wide variety of network architectures, making no assumptions about the nature of triggers used to alter network behavior, and being computationally efficient. We provide a detailed description of the detection pipeline along with results on models trained on the MNIST digit dataset, CIFAR-10 dataset, and two difficult datasets from NIST's TrojAI competition. These results show that our method detects backdoored networks more accurately and efficiently than current state-of-the-art methods.
Backdoor Attack Detection in Computer Vision by Applying Matrix Factorization on the Weights of Deep Networks
Hossain, Khondoker Murad, Oates, Tim
The increasing importance of both deep neural networks (DNNs) and cloud services for training them means that bad actors have more incentive and opportunity to insert backdoors to alter the behavior of trained models. In this paper, we introduce a novel method for backdoor detection that extracts features from pre-trained DNN's weights using independent vector analysis (IVA) followed by a machine learning classifier. In comparison to other detection techniques, this has a number of benefits, such as not requiring any training data, being applicable across domains, operating with a wide range of network architectures, not assuming the nature of the triggers used to change network behavior, and being highly scalable. We discuss the detection pipeline, and then demonstrate the results on two computer vision datasets regarding image classification and object detection. Our method outperforms the competing algorithms in terms of efficiency and is more accurate, helping to ensure the safe application of deep learning and AI.