An Anomaly Contribution Explainer for Cyber-Security Applications
Zhang, Xiao, Marwah, Manish, Lee, I-ta, Arlitt, Martin, Goldwasser, Dan
--In this paper we introduce Anomaly Contribution Explainer or ACE, a tool to explain security anomaly detection models in terms of the model features through a regression framework, and its variant, ACE-KL, which highlights the important anomaly contributors. ACE and ACE-KL provide insights in diagnosing which attributes significantly contribute to an anomaly by building a specialized linear model to locally approximate the anomaly score that a black-box model generates. We conducted experiments with these anomaly detection models to detect security anomalies on both synthetic data and real data. In particular, we evaluate performance on three public data sets: CERT insider threat, netflow logs, and Android malware. The experimental results are encouraging: our methods consistently identify the correct contributing feature in the synthetic data where ground truth is available; similarly, for real data sets, our methods point a security analyst in the direction of the underlying causes of an anomaly, including in one case leading to the discovery of previously overlooked network scanning activity. We have made our source code publicly available. Cyber-security is a key concern for both private and public organizations, given the high cost of security compromises and attacks; malicious cyber-activity cost the U.S. economy between $57 billion and $109 billion in 2016 [1]. As a result, spending on security research and development, and security products and services to detect and combat cyber-attacks has been increasing [2]. Organizations produce large amounts of network, host and application data that can be used to gain insights into cyber-security threats, misconfigurations, and network operations. While security domain experts can manually sift through some amount of data to spot attacks and understand them, it is virtually impossible to do so at scale, considering that even a medium sized enterprise can produce terabytes of data in a few hours.
Nov-30-2019
- Country:
- Asia > China (0.04)
- North America
- United States (0.34)
- Canada (0.04)
- Europe > United Kingdom
- England > Cambridgeshire > Cambridge (0.04)
- Genre:
- Research Report > New Finding (0.93)
- Industry:
- Information Technology > Security & Privacy (1.00)
- Government > Military
- Cyberwarfare (0.34)
- Technology: