Hamidouche, Mounia
Enhancing IoT Security via Automatic Network Traffic Analysis: The Transition from Machine Learning to Deep Learning
Hamidouche, Mounia, Popko, Eugeny, Ouni, Bassem
This work provides a comparative analysis illustrating how Deep Learning (DL) surpasses Machine Learning (ML) in addressing tasks within the Internet of Things (IoT), such as attack classification and device-type identification. Our approach involves training and evaluating a DL model on a range of diverse IoT-related datasets, allowing us to gain valuable insights into how adaptable and practical these models are when confronted with various IoT configurations. We first convert the unstructured network traffic data from IoT networks, stored in PCAP files, into images by processing the packet data. This conversion adapts the data to the input requirements of DL classification methods. The experiments showcase the ability of DL to overcome the constraints of manually engineered features, achieving superior results in attack detection and comparable results in device-type identification. Additionally, a notable difference in feature extraction time emerges in the experiments: traditional methods require around 29 milliseconds per data packet, while DL accomplishes the same task in just 2.9 milliseconds. The significant time gap, DL's superior performance, and the recognized limitations of manually engineered features present a compelling call to action for the IoT community: to shift from engineering new IoT features for each dataset to addressing the challenges of integrating DL into IoT, making it a more efficient solution for real-world IoT scenarios.
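The packet-to-image conversion described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' exact pipeline: the function name, the 32x32 image size, and the truncate/zero-pad policy are all assumptions made for the example.

```python
import numpy as np

def packet_to_image(packet_bytes: bytes, side: int = 32) -> np.ndarray:
    """Map raw packet bytes to a fixed-size grayscale image.

    Bytes beyond side*side are truncated; shorter packets are
    zero-padded, so every packet yields a (side, side) uint8 array
    usable as input to a DL image classifier. (Illustrative policy,
    assumed for this sketch.)
    """
    n = side * side
    buf = np.frombuffer(packet_bytes[:n], dtype=np.uint8)
    img = np.zeros(n, dtype=np.uint8)
    img[:buf.size] = buf
    return img.reshape(side, side)

# Example: a synthetic 100-byte "packet" (values 0..99, then zero padding)
img = packet_to_image(bytes(range(100)))
print(img.shape)  # (32, 32)
```

In practice the bytes would come from a PCAP parser rather than a synthetic buffer, and per-byte normalization to [0, 1] is a common extra step before feeding a network.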
Graphs as Tools to Improve Deep Learning Methods
Lassance, Carlos, Bontonou, Myriam, Hamidouche, Mounia, Pasdeloup, Bastien, Drumetz, Lucas, Gripon, Vincent
In recent years, deep neural networks (DNNs) have seen a significant rise in popularity. However, although they are state-of-the-art in many machine learning challenges, they still suffer from several limitations. For example, DNNs require a lot of training data, which might not be available in some practical applications. In addition, when small perturbations are added to the inputs, DNNs are prone to misclassification errors. DNNs are also viewed as black boxes, and as such their decisions are often criticized for their lack of interpretability. In this chapter, we review recent works that aim at using graphs as tools to improve deep learning methods. These graphs are defined considering a specific layer in a deep learning architecture. Their vertices represent distinct samples, and their edges depend on the similarity of the corresponding intermediate representations. These graphs can then be leveraged using various methodologies, many of which are built on top of graph signal processing. This chapter is composed of four main parts: tools for visualizing intermediate layers in a DNN, denoising data representations, optimizing graph objective functions, and regularizing the learning process.
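The sample graphs described above can be built and used, for instance, to measure how smoothly the label signal varies over a k-nearest-neighbor graph of intermediate representations. The sketch below is an illustrative construction, not the chapter's specific method; the function name, the choice of k, and the use of the combinatorial Laplacian are assumptions.

```python
import numpy as np

def label_smoothness(reps: np.ndarray, labels: np.ndarray, k: int = 3) -> float:
    """Laplacian quadratic form of the one-hot label signal on a k-NN
    similarity graph whose vertices are samples and whose edges link
    samples with close intermediate representations.

    A low value means same-class samples tend to be neighbors, i.e.
    the label signal is smooth on the graph.
    """
    n = reps.shape[0]
    # Pairwise squared distances between intermediate representations.
    d2 = ((reps[:, None, :] - reps[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)  # exclude self from neighbors
    adj = np.zeros((n, n))
    for i in range(n):
        adj[i, np.argsort(d2[i])[:k]] = 1.0
    adj = np.maximum(adj, adj.T)  # symmetrize the k-NN graph
    lap = np.diag(adj.sum(1)) - adj  # combinatorial Laplacian D - A
    onehot = np.eye(int(labels.max()) + 1)[labels]
    # trace(Y^T L Y) = sum over edges of ||y_i - y_j||^2
    return float(np.trace(onehot.T @ lap @ onehot))
```

With two well-separated clusters carrying one label each, all edges stay within a class and the smoothness is zero; shuffling the labels makes it strictly positive.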
Graph Filtering for Improving the Accuracy of Classification Problems
Hamidouche, Mounia, Lassance, Carlos, Hu, Yuqing, Drumetz, Lucas, Pasdeloup, Bastien, Gripon, Vincent
In machine learning, classifiers are typically susceptible to noise in the training data. In this work, we aim at reducing intra-class noise with the help of graph filtering to improve classification performance. The considered graphs are obtained by connecting samples of the training set that belong to the same class, depending on the similarity of their representations in a latent space. By viewing the features in latent representations of samples as graph signals, it is possible to filter them to remove their high-frequency components, thus improving the signal-to-noise ratio. As a consequence, the intra-class variance gets smaller while the mean remains the same, as shown theoretically in this article. We support this analysis with an experimental evaluation of the impact of graph filtering on the accuracy of multiple standard benchmarks of the field. While our approach applies to classification problems in general, it is particularly useful in few-shot settings, where intra-class noise has a huge impact due to initial sample selection.
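The intra-class graph filtering idea can be sketched as a low-pass diffusion over a similarity graph built from the latent representations of one class. This is a minimal illustration under assumptions not taken from the paper: a cosine-similarity graph, a symmetrically normalized adjacency, and a simple polynomial filter (1 - alpha)I + alpha*A applied k times.

```python
import numpy as np

def low_pass_filter(features: np.ndarray, k: int = 2, alpha: float = 0.5) -> np.ndarray:
    """Smooth the latent features of same-class samples by k rounds of
    graph diffusion, attenuating high graph frequencies (intra-class noise).

    features: (n_samples, dim) latent representations of one class.
    """
    # Cosine-similarity graph over the samples (negative similarities
    # clipped to zero, no self-loops) -- an assumed construction.
    normed = features / np.linalg.norm(features, axis=1, keepdims=True)
    adj = np.clip(normed @ normed.T, 0.0, None)
    np.fill_diagonal(adj, 0.0)
    # Symmetrically normalized adjacency: D^{-1/2} A D^{-1/2}.
    deg = adj.sum(axis=1)
    d_inv_sqrt = np.where(deg > 0, deg ** -0.5, 0.0)
    adj_norm = d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
    # The filter (1 - alpha) I + alpha * A_norm has eigenvalues in [0, 1]
    # for alpha = 0.5, so repeated application damps high frequencies.
    out = features.astype(float)
    for _ in range(k):
        out = (1 - alpha) * out + alpha * (adj_norm @ out)
    return out
```

Applied per class before training a classifier, this shrinks intra-class variance, which is the effect the abstract analyzes theoretically.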