Identifying Relevant Features of CSE-CIC-IDS2018 Dataset for the Development of an Intrusion Detection System
Göcs, László, Johanyák, Zsolt Csaba
arXiv.org Artificial Intelligence
Intrusion detection systems (IDSs) are essential elements of IT systems. Their key component is a classification module that continuously evaluates selected features of the network traffic and identifies possible threats. Its efficiency depends heavily on choosing the right features to monitor. Identifying a minimal set of features that reliably distinguishes malicious traffic from benign traffic is therefore indispensable when developing an IDS. This paper presents the preprocessing and feature selection workflow, together with its results, for the CSE-CIC-IDS2018 on AWS dataset, focusing on five attack types. To identify the relevant features, six feature selection methods were applied, and the final ranking of the features was derived from their average score. Next, several feature subsets were formed using different ranking threshold values, and each subset was evaluated with five classification algorithms to determine the optimal feature set for each attack type. Four widely used metrics were considered during the evaluation.
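The ranking step described in the abstract (scores from several feature selection methods, averaged into one ranking and then cut at different thresholds) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the method names, feature names, score values, and thresholds are hypothetical, and the min-max normalization is only one plausible way to make scores from different methods comparable before averaging. The abstract does not say how the authors combined the scores.

```python
from statistics import mean


def min_max_normalize(scores):
    """Scale one method's scores to [0, 1] (assumed step, not from the paper)."""
    lo, hi = min(scores.values()), max(scores.values())
    span = hi - lo
    return {f: (s - lo) / span if span else 0.0 for f, s in scores.items()}


def average_ranking(method_scores):
    """Average the normalized scores per feature and sort in descending order."""
    normalized = [min_max_normalize(s) for s in method_scores.values()]
    features = normalized[0].keys()
    avg = {f: mean(n[f] for n in normalized) for f in features}
    return sorted(avg.items(), key=lambda kv: kv[1], reverse=True)


def subsets_by_threshold(ranking, thresholds):
    """Form one candidate feature subset per ranking threshold."""
    return {t: [f for f, s in ranking if s >= t] for t in thresholds}


# Hypothetical scores from two placeholder methods (the paper uses six).
method_scores = {
    "method_a": {"flow_duration": 0.9, "fwd_pkt_len_max": 0.5, "dst_port": 0.1},
    "method_b": {"flow_duration": 40.0, "fwd_pkt_len_max": 30.0, "dst_port": 10.0},
}

ranking = average_ranking(method_scores)
candidate_sets = subsets_by_threshold(ranking, thresholds=[0.25, 0.75])
```

In the workflow the abstract outlines, each candidate subset would then be used to train and evaluate the classifiers for a given attack type. That step is omitted here.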
Jul-21-2023
- Country:
- North America > United States
- New York (0.04)
- Texas > Travis County
- Austin (0.04)
- Europe > Hungary
- Bács-Kiskun County > Kecskemét (0.04)
- Asia > Japan
- Honshū > Kansai > Kyoto Prefecture > Kyoto (0.04)
- Genre:
- Research Report (1.00)
- Industry:
- Technology:
- Information Technology
- Security & Privacy (1.00)
- Communications > Networks (1.00)
- Artificial Intelligence
- Representation & Reasoning > Uncertainty
- Bayesian Inference (0.93)
- Fuzzy Logic (0.67)
- Machine Learning
- Statistical Learning (1.00)
- Performance Analysis > Accuracy (1.00)
- Neural Networks (0.93)
- Learning Graphical Models > Directed Networks
- Bayesian Learning (1.00)