Guardian of the Ensembles: Introducing Pairwise Adversarially Robust Loss for Resisting Adversarial Attacks in DNN Ensembles
Shukla, Shubhi, Dalui, Subhadeep, Alam, Manaar, Datta, Shubhajit, Mondal, Arijit, Mukhopadhyay, Debdeep, Chakrabarti, Partha Pratim
Adversarial attacks rely on transferability, where an adversarial example (AE) crafted on a surrogate classifier tends to mislead a target classifier. Recent ensemble methods demonstrate that AEs are less likely to mislead multiple classifiers in an ensemble. This paper proposes a new ensemble training method using a Pairwise Adversarially Robust Loss (PARL) that, by construction, produces an ensemble of classifiers with diverse decision boundaries. PARL utilizes the outputs and gradients of each layer with respect to the network parameters of every classifier in the ensemble simultaneously. PARL is demonstrated to achieve higher robustness against black-box transfer attacks than previous ensemble methods as well as adversarial training, without adversely affecting clean-example accuracy. Extensive experiments using standard ResNet20 and WideResNet28-10 classifiers demonstrate the robustness of PARL against state-of-the-art adversarial attacks. While maintaining similar clean accuracy and lower training time, the proposed architecture achieves a 24.8% increase in robust accuracy ($\epsilon = 0.07$) over the state-of-the-art method.
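To make the pairwise-diversity idea concrete, the following is a minimal PyTorch sketch of an ensemble objective that penalises aligned gradients between every pair of members. It uses input gradients as a simplified stand-in for the layer-wise outputs and parameter gradients that PARL actually combines, so it illustrates the construction rather than reproducing the paper's exact loss; the names pairwise_diversity_penalty, ensemble_loss, and the weight lam are illustrative.

    import torch
    import torch.nn.functional as F

    def pairwise_diversity_penalty(models, x, y):
        # Cosine similarity between the input gradients of every pair of members;
        # lower similarity encourages more diverse decision boundaries.
        grads = []
        for model in models:
            x_in = x.clone().detach().requires_grad_(True)
            loss = F.cross_entropy(model(x_in), y)
            g, = torch.autograd.grad(loss, x_in, create_graph=True)
            grads.append(g.flatten(1))                       # shape: (batch, features)
        penalty = x.new_zeros(())
        for i in range(len(grads)):
            for j in range(i + 1, len(grads)):
                penalty = penalty + F.cosine_similarity(grads[i], grads[j], dim=1).mean()
        return penalty

    def ensemble_loss(models, x, y, lam=1.0):
        # Every member is trained on clean cross-entropy plus the shared penalty,
        # so accuracy on clean examples is preserved while boundaries diverge.
        ce = sum(F.cross_entropy(m(x), y) for m in models)
        return ce + lam * pairwise_diversity_penalty(models, x, y)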
Stealing the Invisible: Unveiling Pre-Trained CNN Models through Adversarial Examples and Timing Side-Channels
Shukla, Shubhi, Alam, Manaar, Mitra, Pabitra, Mukhopadhyay, Debdeep
Machine learning, with its myriad applications, has become an integral component of numerous technological systems. A common practice in this domain is transfer learning, where a pre-trained model's architecture, readily available to the public, is fine-tuned for specific tasks. As Machine Learning as a Service (MLaaS) platforms increasingly use pre-trained models in their backends, it is crucial to safeguard these architectures and understand their vulnerabilities. In this work, we present an approach based on the observation that the classification patterns of adversarial images can be used to steal the models. Furthermore, these adversarial image classifications, in conjunction with timing side channels, lead to a model stealing method. Our approach, designed for typical user-level access in remote MLaaS environments, exploits the varying misclassifications of adversarial images across different models to fingerprint several renowned Convolutional Neural Network (CNN) and Vision Transformer (ViT) architectures. We profile remote model inference times to reduce the number of adversarial images needed, thereby decreasing the number of queries required. We present results over 27 pre-trained models of different CNN and ViT architectures using the CIFAR-10 dataset and demonstrate a high accuracy of 88.8% while keeping the query budget under 20.
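As an illustration of the fingerprinting step described above, the following is a minimal sketch: a fixed set of adversarial images is classified by locally held reference models to build label-pattern fingerprints, a timing profile shortlists candidate architectures, and the remote model is matched to the closest fingerprint. The interfaces local_models, remote_query, and timing_profiles are hypothetical placeholders for components obtained by profiling, not an actual MLaaS API.

    import numpy as np

    def build_fingerprints(local_models, adv_images):
        # local_models maps architecture name -> callable returning a class label.
        return {name: np.array([predict(x) for x in adv_images])
                for name, predict in local_models.items()}

    def shortlist_by_timing(measured_time, timing_profiles, tolerance=0.1):
        # Keep architectures whose profiled mean inference time lies within a
        # relative tolerance of the time measured on the remote service.
        return [name for name, t in timing_profiles.items()
                if abs(t - measured_time) <= tolerance * t]

    def identify(remote_query, adv_images, fingerprints, candidates=None):
        # Query the remote model on the adversarial set and pick the candidate
        # whose stored label pattern agrees on the largest number of images.
        observed = np.array([remote_query(x) for x in adv_images])
        names = candidates if candidates is not None else list(fingerprints)
        return max(names, key=lambda n: int((fingerprints[n] == observed).sum()))

Shortlisting by timing before matching fingerprints is what keeps the number of adversarial queries, and hence the query budget, small.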
Deep-Lock: Secure Authorization for Deep Neural Networks
Alam, Manaar, Saha, Sayandeep, Mukhopadhyay, Debdeep, Kundu, Sandip
Trained Deep Neural Network (DNN) models are considered valuable Intellectual Property (IP) in several business models. Prevention of IP theft and unauthorized usage of such DNN models has been raised as a significant concern by industry. In this paper, we address the problem of preventing unauthorized usage of DNN models by proposing a generic and lightweight key-based model-locking scheme, which ensures that a locked model functions correctly only when the correct secret key is applied. The proposed scheme, known as Deep-Lock, utilizes S-Boxes with good security properties to encrypt each parameter of a trained DNN model with secret keys generated from a master key via a key scheduling algorithm. The resulting dense network of encrypted weights is found to be robust against model fine-tuning attacks. Finally, Deep-Lock does not require any intervention in the structure or training of the DNN models, making it applicable to all existing software and hardware implementations of DNNs.
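The following is a minimal sketch of the kind of key-based weight locking described above, assuming NumPy float32 parameter tensors. The SHA-256 counter-mode keystream and the key-seeded 8-bit substitution box are illustrative stand-ins for the paper's key scheduling algorithm and its cryptographically strong S-Box; only the structure (per-parameter key material plus an S-Box substitution, with an exact inverse for authorised users) follows the description above.

    import hashlib
    import numpy as np

    def keyed_sbox(master_key: bytes):
        # Derive an 8-bit substitution box (and its inverse) from the master key.
        seed = int.from_bytes(hashlib.sha256(master_key).digest()[:8], "big")
        sbox = np.random.default_rng(seed).permutation(256).astype(np.uint8)
        inv = np.empty(256, dtype=np.uint8)
        inv[sbox] = np.arange(256, dtype=np.uint8)
        return sbox, inv

    def keystream(master_key: bytes, n: int):
        # Expand the master key into n keystream bytes (SHA-256 in counter mode).
        out = bytearray()
        counter = 0
        while len(out) < n:
            out += hashlib.sha256(master_key + counter.to_bytes(8, "big")).digest()
            counter += 1
        return np.frombuffer(bytes(out[:n]), dtype=np.uint8)

    def lock(weights: np.ndarray, master_key: bytes) -> np.ndarray:
        # Mask every byte of the float32 weight tensor, then substitute it.
        sbox, _ = keyed_sbox(master_key)
        b = np.frombuffer(np.asarray(weights, dtype=np.float32).tobytes(), dtype=np.uint8)
        locked = sbox[b ^ keystream(master_key, b.size)]
        return np.frombuffer(locked.tobytes(), dtype=np.float32).reshape(weights.shape)

    def unlock(locked: np.ndarray, master_key: bytes) -> np.ndarray:
        # Exact inverse of lock(): only the correct key recovers the weights.
        _, inv = keyed_sbox(master_key)
        b = np.frombuffer(np.asarray(locked, dtype=np.float32).tobytes(), dtype=np.uint8)
        raw = inv[b] ^ keystream(master_key, b.size)
        return np.frombuffer(raw.tobytes(), dtype=np.float32).reshape(locked.shape)

Because locking acts only on the stored parameters, the model architecture and training pipeline are untouched, which is what makes the approach implementation-agnostic.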
Enhancing Fault Tolerance of Neural Networks for Security-Critical Applications
Alam, Manaar, Bag, Arnab, Roy, Debapriya Basu, Jap, Dirmanto, Breier, Jakub, Bhasin, Shivam, Mukhopadhyay, Debdeep
Neural Networks (NNs) have recently emerged as the backbone of several sensitive applications such as automotive systems, medical imaging, and security. NNs inherently offer Partial Fault Tolerance (PFT) in their architecture; however, the biased PFT of NNs can lead to severe consequences in cryptographic and other security-critical scenarios. In this paper, we propose a revised implementation that significantly enhances the PFT property of NNs, supported by detailed mathematical analysis. We evaluate the performance of the revised NN for both software and FPGA implementations of a cryptographic primitive, the AES S-Box. The results show that the PFT of NNs can be significantly increased with the proposed methodology.
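A minimal sketch of how the fault tolerance mentioned above can be quantified is given below: random single-bit faults are injected into a float32 weight tensor and the fraction of faults that leave the network's outputs (for example, the 256 AES S-Box entries) unchanged is reported. The predict callable, the single-bit-flip fault model, and the all-or-nothing correctness criterion are illustrative assumptions rather than the paper's exact evaluation setup.

    import numpy as np

    def flip_random_bit(weights: np.ndarray, rng) -> np.ndarray:
        # Return a copy of the float32 weights with one randomly chosen bit flipped.
        raw = np.frombuffer(np.asarray(weights, dtype=np.float32).tobytes(),
                            dtype=np.uint8).copy()
        bit = int(rng.integers(raw.size * 8))
        raw[bit // 8] ^= np.uint8(1 << (bit % 8))
        return np.frombuffer(raw.tobytes(), dtype=np.float32).reshape(weights.shape)

    def pft_score(predict, weights, inputs, labels, trials=1000, seed=0):
        # Fraction of injected single-bit faults for which every prediction
        # (e.g. every AES S-Box entry) remains correct.
        rng = np.random.default_rng(seed)
        ok = 0
        for _ in range(trials):
            faulty = flip_random_bit(weights, rng)
            ok += int(np.all(predict(faulty, inputs) == labels))
        return ok / trials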
How Secure are Deep Learning Algorithms from Side-Channel based Reverse Engineering?
Alam, Manaar, Mukhopadhyay, Debdeep
Deep Learning algorithms have recently become the de-facto paradigm for various prediction problems, including many privacy-preserving applications such as online medical image analysis. Consequently, the privacy of data in a deep learning system is a serious concern. There have been several efforts to analyze and exploit the information leakages from deep learning architectures to compromise data privacy. In this paper, however, we attempt to provide an evaluation strategy for such information leakages through deep neural network architectures by considering a case study on a Convolutional Neural Network (CNN) based image classifier. The approach uses low-level hardware information provided by Hardware Performance Counters (HPCs) during the execution of the CNN classifier, together with a simple hypothesis test, to raise an alarm if any information about the actual input is leaked.
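The hypothesis-testing step described above can be sketched as follows, assuming the HPC readings (for example, cache-miss counts collected with a tool such as Linux perf) have already been recorded for two groups of inputs. The Welch t-test, the synthetic counter values, and the threshold of 4.5 are illustrative choices rather than the paper's exact statistical procedure.

    import numpy as np
    from scipy.stats import ttest_ind

    def leakage_alarm(hpc_group_a, hpc_group_b, threshold=4.5):
        # Flag leakage if the counter distributions of the two input groups
        # are statistically distinguishable (Welch's t-test).
        t_stat, _ = ttest_ind(np.asarray(hpc_group_a),
                              np.asarray(hpc_group_b),
                              equal_var=False)
        return abs(t_stat) > threshold

    # Synthetic counter values standing in for real HPC traces of two input classes.
    rng = np.random.default_rng(0)
    group_a = rng.normal(1_000_000, 5_000, size=200)   # e.g. cache misses, input class A
    group_b = rng.normal(1_020_000, 5_000, size=200)   # e.g. cache misses, input class B
    print(leakage_alarm(group_a, group_b))             # True: distributions differ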
Adversarial Attacks and Defences: A Survey
Chakraborty, Anirban, Alam, Manaar, Dey, Vishal, Chattopadhyay, Anupam, Mukhopadhyay, Debdeep
Deep learning has emerged as a strong and efficient framework that can be applied to a broad spectrum of complex learning problems which were difficult to solve using traditional machine learning techniques. In the last few years, deep learning has advanced so radically that it can surpass human-level performance on a number of tasks. As a consequence, deep learning is being extensively used in most recent day-to-day applications. However, deep learning systems are vulnerable to crafted adversarial examples, which may be imperceptible to the human eye but can lead the model to misclassify its input. In recent times, different types of adversaries, characterized by their threat models, leverage these vulnerabilities to compromise deep learning systems in settings where adversaries have high incentives. Hence, it is extremely important to provide robustness to deep learning algorithms against these adversaries. However, there are only a few strong countermeasures that can be used in all types of attack scenarios to design a robust deep learning system. In this paper, we attempt to provide a detailed discussion of different types of adversarial attacks with various threat models and also elaborate on the efficacy and challenges of recent countermeasures against them.
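Since crafted adversarial examples are central to the attacks surveyed here, a minimal PyTorch sketch of the classic Fast Gradient Sign Method (FGSM) is included for concreteness; the model is any differentiable classifier, eps bounds the L-infinity perturbation, and inputs are assumed to lie in [0, 1].

    import torch
    import torch.nn.functional as F

    def fgsm(model, x, y, eps=0.03):
        # Perturb each input one step along the sign of the loss gradient:
        # x_adv = clip(x + eps * sign(grad_x loss(model(x), y)), 0, 1).
        x_adv = x.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        loss.backward()
        return (x_adv + eps * x_adv.grad.sign()).clamp(0.0, 1.0).detach()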