Bayesian Networks with Prior Knowledge for Malware Phylogenetics

Oyen, Diane (Los Alamos National Laboratory) | Anderson, Blake (Cisco Systems, Inc) | Anderson-Cook, Christine (Los Alamos National Laboratory)

AAAI Conferences 

Malware phylogenetics helps cybersecurity experts quickly understand a new malware sample by placing it in the context of similar samples that have already been reverse engineered. Recently, researchers have begun using malware code as data to infer directed acyclic graphs (DAGs) that model the evolutionary relationships among malware samples. A DAG is the ideal model for a phylogenetic graph because it can represent the merges and branches that are often present in malware evolution. We present a novel Bayesian network discovery algorithm that learns a DAG via statistical inference of conditional dependencies from observed data, combined with an informative prior on the partial ordering of variables. Our approach leverages the information on edge direction that a human can provide and the inference of edge presence that data can provide. We give an efficient implementation of the partial-order prior in a Bayesian structure discovery algorithm, as well as a related structure prior, and show that both priors meet the local modularity requirement necessary for efficient Bayesian discovery. We apply our algorithm to learn phylogenetic graphs on three malicious families and two benign families where the ground truth is known, and show that, compared to competing algorithms, our algorithm more accurately identifies directed edges.
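
The local modularity requirement mentioned above means that the prior over a whole graph factors into one term per node and its parent set. The following is a minimal sketch of that idea only, not the authors' implementation: the function names, the `order_belief` dictionary, and the probabilities in it are hypothetical, and the per-edge product form is an assumption chosen for illustration.

```python
import math

def local_log_prior(child, parents, order_belief, default=0.5):
    """Log-prior term for one node and its parent set.

    order_belief maps (parent, child) to an expert's belief that the parent
    precedes the child. Pairs with no stated belief get `default`.
    (Hypothetical form, assumed for illustration.)
    """
    return sum(math.log(order_belief.get((p, child), default)) for p in parents)

def dag_log_prior(parent_sets, order_belief):
    """Log prior of a whole graph as a sum of per-node terms.

    The sum over nodes is what makes the prior modular: changing one node's
    parent set changes only that node's term.
    """
    return sum(
        local_log_prior(child, parents, order_belief)
        for child, parents in parent_sets.items()
    )

# Hypothetical expert beliefs about which sample came first.
order_belief = {
    ("v1", "v2"): 0.9, ("v2", "v1"): 0.1,
    ("v2", "v3"): 0.8, ("v3", "v2"): 0.2,
}

# Two candidate graphs with the same edges but opposite directions.
forward = {"v1": [], "v2": ["v1"], "v3": ["v2"]}
backward = {"v1": ["v2"], "v2": ["v3"], "v3": []}

forward_score = dag_log_prior(forward, order_belief)
backward_score = dag_log_prior(backward, order_belief)
```

Under these assumed beliefs, `forward_score` exceeds `backward_score`, so the prior favors the edge directions the expert considers more plausible, while the data likelihood (not modeled here) would determine which edges are present.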
