On the Veracity of Cyber Intrusion Alerts Synthesized by Generative Adversarial Networks

Sweet, Christopher, Moskal, Stephen, Yang, Shanchieh Jay

Aug-3-2019–arXiv.org Machine Learning

Recreating cyber-attack alert data with a high level of fidelity is challenging due to the intricate interaction between features, the non-homogeneity of alerts, and the potential for rare yet critical samples. Generative Adversarial Networks (GANs) have been shown to effectively learn complex data distributions with the intent of creating increasingly realistic data. This paper presents the application of GANs to cyber-attack alert data and shows that GANs not only successfully learn to generate realistic alerts, but also reveal feature dependencies within alerts. This is accomplished by reviewing the intersection of histograms for varying alert-feature combinations between the ground truth and generated datasets. Traditional statistical metrics, such as conditional and joint entropy, are also employed to verify the accuracy of these dependencies. Finally, it is shown that a Mutual Information constraint on the network can be used to increase the generation of low-probability, critical alert values. By mapping alerts to a set of attack stages, it is shown that the output of these low-probability alerts has a direct contextual meaning for Cyber Security analysts. Overall, this work provides the basis for generating new cyber intrusion alerts and provides evidence that synthesized alerts emulate critical dependencies from the source dataset.

INTRODUCTION

Classifying, predicting, and generating cyber-attack alert data presents a unique set of challenges due to imbalance and a lack of homogeneity in alert datasets. Compounding these challenges, critical exploits in a network are often rare and difficult to identify. Despite this, it has been shown that alert data can be used to identify anomalous traffic [1] [2] [3], find network vulnerabilities [4], and profile bad-actor behavior [5]. However, to fully realize the potential of cyber-attack alert data, a means to acquire more data and analyze critical dependencies within alerts is needed.

This work seeks to provide solutions to these challenges by showing that deep learning models are able to recreate cyber-attack alert data when given representative real-world data. This includes a means for driving better coverage of the feature domain in model outputs, allowing rarer but critical events to be synthesized.
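
The abstract names three checks on generated alerts: histogram intersection over alert-feature combinations, conditional and joint entropy, and mutual information. The sketch below shows how such quantities are commonly computed from samples. It is a minimal illustration, not the authors' implementation. The function names, the two-feature alert layout (signature, destination port), and the sample values are all assumptions made for the example. The mutual information function only measures dependence in a dataset; it is not the training-time constraint on the network that the paper describes.

```python
from collections import Counter
from math import log2


def histogram(samples):
    """Normalized histogram: maps each distinct value to its relative frequency."""
    counts = Counter(samples)
    total = sum(counts.values())
    return {value: n / total for value, n in counts.items()}


def histogram_intersection(real, generated):
    """Sum of bin-wise minima of the two normalized histograms.

    Returns 1.0 for identical distributions and 0.0 for disjoint ones.
    """
    p, q = histogram(real), histogram(generated)
    return sum(min(p[v], q.get(v, 0.0)) for v in p)


def entropy(samples):
    """Shannon entropy H(X) in bits, estimated from samples."""
    return -sum(p * log2(p) for p in histogram(samples).values())


def joint_entropy(xs, ys):
    """H(X, Y): entropy of the paired feature values."""
    return entropy(list(zip(xs, ys)))


def conditional_entropy(ys, xs):
    """H(Y | X) = H(X, Y) - H(X)."""
    return joint_entropy(xs, ys) - entropy(xs)


def mutual_information(xs, ys):
    """I(X; Y) = H(X) + H(Y) - H(X, Y)."""
    return entropy(xs) + entropy(ys) - joint_entropy(xs, ys)


# Hypothetical alerts as (signature, destination port) pairs; values are made up.
real = [("scan", 22), ("scan", 22), ("scan", 80), ("exploit", 445)]
generated = [("scan", 22), ("scan", 80), ("scan", 80), ("exploit", 445)]

# Compare the joint (signature, port) histograms of the two datasets.
overlap = histogram_intersection(real, generated)

# Dependency statistics on the ground-truth data.
signatures = [sig for sig, _ in real]
ports = [port for _, port in real]
h_port_given_sig = conditional_entropy(ports, signatures)
mi = mutual_information(signatures, ports)

print(f"intersection={overlap:.2f}  H(port|sig)={h_port_given_sig:.3f}  I(sig;port)={mi:.3f}")
```

Running the same entropy statistics on both the ground-truth and the generated alerts, and comparing the results feature pair by feature pair, is one way to check whether a dependency present in the source data also appears in the synthesized data.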

alert, dataset, dependency, (13 more...)

Country:
- Oceania > Australia
  - New South Wales > Sydney (0.14)
- North America
  - United States
    - Tennessee (0.04)
    - Louisiana > Orleans Parish
      - New Orleans (0.04)
    - Hawaii > Honolulu County
      - Honolulu (0.04)
    - California
      - San Francisco County > San Francisco (0.14)
      - Los Angeles County > Long Beach (0.04)
      - Santa Clara County > Palo Alto (0.04)
  - Canada > Alberta
    - Census Division No. 6 > Calgary Metropolitan Region > Calgary (0.04)
- Europe
  - Italy (0.04)
  - Sweden > Stockholm
    - Stockholm (0.04)
  - Spain > Catalonia
    - Barcelona Province > Barcelona (0.04)
- Asia
  - Taiwan > Takao Province
    - Kaohsiung (0.04)
  - Myanmar > Mandalay Region
    - Mandalay (0.04)
  - China > Beijing
    - Beijing (0.04)

Genre:
- Research Report (0.64)

Industry:
- Information Technology > Security & Privacy (1.00)
- Government > Military
  - Cyberwarfare (1.00)

Technology:
- Information Technology
  - Security & Privacy (1.00)
  - Artificial Intelligence > Machine Learning
    - Neural Networks > Deep Learning (0.69)
