
Collaborating Authors

 Taheri, Rahim


Federated Learning Under Attack: Exposing Vulnerabilities through Data Poisoning Attacks in Computer Networks

arXiv.org Artificial Intelligence

Federated Learning (FL) is a machine learning (ML) approach that enables multiple decentralized devices or edge servers to collaboratively train a shared model without exchanging raw data. During training and the sharing of model updates between clients and servers, both data and models are susceptible to different data-poisoning attacks. In this study, our motivation is to explore the severity of data-poisoning attacks in the computer network domain because they are easy to implement but difficult to detect. We considered two types of data-poisoning attacks, label flipping (LF) and feature poisoning (FP), and applied them with a novel approach. In LF, we randomly flipped the labels of benign data and trained the model on the manipulated data. For FP, we randomly manipulated the highly contributing features, determined using the Random Forest algorithm. The datasets used in this experiment were CIC and UNSW, both related to computer networks. We generated adversarial samples using the two attacks mentioned above and applied them to a small percentage of each dataset. Subsequently, we trained the model on the adversarial datasets and tested its accuracy. We recorded the results for both benign and manipulated datasets and observed significant differences between the accuracy of the models on the different datasets. The experimental results show that the LF attack failed, whereas the FP attack was effective, proving its significance in fooling a server. With a 1% LF attack on the CIC dataset, the accuracy was approximately 0.0428 and the attack success rate (ASR) was 0.9564, so the attack is easily detectable; with a 1% FP attack, the accuracy and ASR were both approximately 0.9600, so FP attacks are difficult to detect. We repeated the experiment with different poisoning percentages.
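
A minimal sketch of the two poisoning strategies described in the abstract, assuming binary labels, a dense tabular feature matrix, and scikit-learn; the poisoning budget, replacement values, and function names are illustrative assumptions, not details from the paper.

```python
# Minimal sketch of the LF and FP attacks described above. Assumptions
# (not from the paper): binary labels {0, 1}, a dense float feature
# matrix, and uniform-random replacement values for poisoned features.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def label_flip(y, poison_rate, rng):
    # LF: flip the labels of a randomly chosen subset of samples.
    y_adv = y.copy()
    idx = rng.choice(len(y), size=int(poison_rate * len(y)), replace=False)
    y_adv[idx] = 1 - y_adv[idx]
    return y_adv

def feature_poison(X, y, poison_rate, n_top, rng):
    # FP: rank features by Random Forest importance, then overwrite the
    # top-ranked features of a random subset of rows with random values.
    rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
    top = np.argsort(rf.feature_importances_)[::-1][:n_top]
    X_adv = X.copy()
    rows = rng.choice(len(X), size=int(poison_rate * len(X)), replace=False)
    for j in top:
        X_adv[rows, j] = rng.uniform(X[:, j].min(), X[:, j].max(), len(rows))
    return X_adv

rng = np.random.default_rng(42)
X = rng.normal(size=(1000, 20))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
y_lf = label_flip(y, poison_rate=0.01, rng=rng)         # 1% LF attack
X_fp = feature_poison(X, y, 0.01, n_top=5, rng=rng)     # 1% FP attack
```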


On Defending Against Label Flipping Attacks on Malware Detection Systems

arXiv.org Artificial Intelligence

Label manipulation attacks are a subclass of data-poisoning attacks in adversarial machine learning, used against different applications such as malware detection. These attacks pose a serious threat to detection systems in environments with high noise rates or uncertainty, such as complex networks and the Internet of Things (IoT). Recent work in the literature has suggested using the K-Nearest Neighbors (KNN) algorithm to defend against such attacks. However, such an approach can suffer from low or even incorrect detection accuracy. In this paper, we design an architecture to tackle the Android malware detection problem in IoT systems. We develop an attack mechanism based on the Silhouette clustering method, modified for mobile Android platforms, and propose two Convolutional Neural Network (CNN)-based deep learning algorithms against this Silhouette Clustering-based Label Flipping Attack (SCLFA). We show the effectiveness of these two defense algorithms, Label-based Semi-supervised Defense (LSD) and Clustering-based Semi-supervised Defense (CSD), in correcting the attacked labels. We evaluate the performance of the proposed algorithms by varying the machine learning parameters on three Android datasets (Drebin, Contagio, and Genome) and three types of features: API, intent, and permission. Our evaluation shows that using Random Forest feature selection and varying the ratios of features can improve accuracy by up to 19% compared with the state-of-the-art method in the literature.
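
A hedged sketch of a silhouette-guided label flipping attack in the spirit of SCLFA; the selection rule used here (flipping the samples with the lowest silhouette values, i.e., those that cluster poorly under their current label) is an assumption for illustration, not the paper's exact procedure.

```python
# Hedged sketch of a silhouette-guided label flipping attack. Assumption
# (illustrative, not the paper's exact rule): flip the labels of the
# samples with the lowest silhouette values, i.e., points that cluster
# poorly under their current label and are hardest to audit.
import numpy as np
from sklearn.metrics import silhouette_samples

def silhouette_label_flip(X, y, budget):
    sil = silhouette_samples(X, y)       # per-sample silhouette in [-1, 1]
    victims = np.argsort(sil)[:budget]   # least well-clustered samples
    y_adv = y.copy()
    y_adv[victims] = 1 - y_adv[victims]  # assumes binary malware labels
    return y_adv

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = rng.integers(0, 2, size=200)
y_poisoned = silhouette_label_flip(X, y, budget=20)  # 10% flipping budget
```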


Similarity-based Android Malware Detection Using Hamming Distance of Static Binary Features

arXiv.org Machine Learning

In this paper, we develop four malware detection methods that use Hamming distance to find similarity between samples: first nearest neighbors (FNN), all nearest neighbors (ANN), weighted all nearest neighbors (WANN), and k-medoid-based nearest neighbors (KMNN). The proposed methods trigger an alarm when an Android app is detected as malicious, helping to avoid the spread of detected malware on a broader scale. We provide a detailed description of the proposed detection methods and related algorithms, and we include an extensive analysis to assess the suitability of our similarity-based detection methods. We perform our experiments on three datasets of benign and malware Android apps: Drebin, Contagio, and Genome. To corroborate the actual effectiveness of our classifier, we carry out performance comparisons with some state-of-the-art classification and malware detection algorithms, namely the Mixed and Separated solutions, the program dissimilarity measure based on entropy (PDME), and the FalDroid algorithm. We test different types of features (API, intent, and permission) on these three datasets. The results confirm that the accuracy rates of the proposed algorithms exceed 90%, in some cases (i.e., considering API features) exceed 99%, and are comparable with existing state-of-the-art solutions.
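
A minimal sketch of the FNN variant over binary static features; the toy feature matrix and the label convention are illustrative assumptions, not the paper's data.

```python
# Minimal sketch of the FNN method: classify an app by the label of its
# closest training sample under Hamming distance over binary static
# features (API / intent / permission bits). Toy data is illustrative.
import numpy as np

def hamming(a, b):
    # Number of differing feature bits between two samples.
    return np.count_nonzero(a != b)

def fnn_predict(X_train, y_train, x):
    dists = np.array([hamming(row, x) for row in X_train])
    return y_train[np.argmin(dists)]

X_train = np.array([[1, 0, 1, 1], [0, 0, 1, 0], [1, 1, 1, 1]])
y_train = np.array([1, 0, 1])        # 1 = malware, 0 = benign (assumed)
print(fnn_predict(X_train, y_train, np.array([1, 1, 1, 0])))  # -> 1
```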


Can Machine Learning Model with Static Features be Fooled: an Adversarial Machine Learning Approach

arXiv.org Artificial Intelligence

The widespread adoption of smartphones dramatically increases the risk of attacks and the spread of mobile malware, especially on the Android platform. Machine learning based solutions have already been used as a tool to supersede signature based anti-malware systems. Hence, to evaluate the vulnerability of machine learning algorithms in malware detection, we propose five different attack scenarios to perturb malicious applications (apps). Each attack inappropriately fits the discriminant function on the set of data points, eventually yielding a higher misclassification rate; in a nutshell, the generated malware sample is statistically identical to a benign sample. To do so, adversaries adopt adversarial machine learning (AML) algorithms to design an example set, called poison data, which is used to fool machine learning models. Further, to distinguish the adversarial examples from benign samples, we propose two defense mechanisms. To validate our attacks and solutions, we test our model on three different datasets. We also test our methods using various classifier algorithms and compare them. Promising results show that the evasive variants generated by our attack models, when used to harden the developed anti-malware system, improve detection. Keywords: adversarial machine learning, malware detection, poison attacks, adversarial examples, Jacobian algorithm.
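
A hedged sketch of a Jacobian/saliency-style evasion in feature space: a linear surrogate model stands in for the Jacobian, and the attack only adds (never removes) the binary static features that push the score most toward the benign class. This illustrates the general idea only; it is not the paper's exact attack, and all names and data are illustrative.

```python
# Hedged sketch of a Jacobian/saliency-style evasion in feature space.
# A linear surrogate stands in for the Jacobian; the attack only adds
# (0 -> 1) features, mirroring the constraint that removing code or
# permissions could break a malicious app. Illustrative only.
import numpy as np
from sklearn.linear_model import LogisticRegression

def evade(model, x_mal, k):
    w = model.coef_.ravel()              # surrogate gradient of the score
    addable = np.where(x_mal == 0)[0]    # features that can be turned on
    best = addable[np.argsort(w[addable])[:k]]  # most benign-pushing bits
    x_adv = x_mal.copy()
    x_adv[best] = 1
    return x_adv

rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(500, 30))
y = rng.integers(0, 2, size=500)       # 1 = malware (assumed convention)
clf = LogisticRegression(max_iter=1000).fit(X, y)
x_adv = evade(clf, X[y == 1][0], k=3)  # perturb one malicious sample
```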