Towards Learning Representations of Binary Executable Files for Security Tasks
Shushan Arakelyan, Christophe Hauser, Erik Kline, Aram Galstyan
Binary analysis problems have traditionally been tackled by manually defining rules and heuristics. As an alternative, we propose using machine learning models to learn distributed representations of binaries that are applicable to a number of downstream tasks. We construct a computational graph from the binary executable and use it with a graph convolutional neural network to learn a high-dimensional representation of the program. We show the versatility of this approach by using our representations to solve two semantically different binary analysis tasks: algorithm classification and vulnerability discovery. We compare the proposed approach to our own strong baseline as well as to published results, and demonstrate improvement over state-of-the-art methods on both tasks.
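To make the pipeline in the abstract concrete, the following is a minimal sketch of a single graph convolution step followed by pooling into one program-level vector. It assumes the widely used normalized-adjacency propagation rule, ReLU(D^-1/2 (A + I) D^-1/2 H W), which may differ from the authors' exact architecture. The toy graph, feature values, weights, and all function names are hypothetical and serve only as illustration.

```python
import math


def normalize_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2 for a dense adjacency matrix."""
    n = len(adj)
    # Add self-loops so each node keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain dense matrix product of two list-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def gcn_layer(norm_adj, features, weights):
    """One graph convolution: ReLU(norm_adj @ features @ weights)."""
    out = matmul(matmul(norm_adj, features), weights)
    return [[max(0.0, v) for v in row] for row in out]


def graph_embedding(node_states):
    """Mean-pool node vectors into a single graph-level vector."""
    n = len(node_states)
    return [sum(col) / n for col in zip(*node_states)]


# Hypothetical toy graph: 3 nodes with undirected edges 0-1 and 1-2.
# A real computational graph extracted from a binary would be directed
# and far larger; this is an assumption made for brevity.
adj = [[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # made-up node features
weights = [[0.5, -0.5], [0.25, 0.75]]            # made-up layer weights

norm_adj = normalize_adjacency(adj)
embedding = graph_embedding(gcn_layer(norm_adj, features, weights))
```

In a trained model the weights would be learned, several such layers would typically be stacked, and the pooled vector would feed a task-specific classifier such as one for algorithm classification or vulnerability discovery.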
Feb-9-2020
- Country:
  - Asia > Vietnam
  - Europe
  - North America
    - Canada
    - United States
      - Arizona > Maricopa County
        - Phoenix (0.04)
      - California
        - Los Angeles County > Long Beach (0.04)
        - San Diego County > San Diego (0.04)
        - San Francisco County > San Francisco (0.04)
        - Santa Clara County > San Jose (0.04)
      - Louisiana > Orleans Parish
        - New Orleans (0.04)
      - Tennessee > Anderson County
        - Oak Ridge (0.04)
      - Texas
        - Dallas County > Dallas (0.04)
        - Travis County > Austin (0.04)
- Genre:
  - Research Report > New Finding (0.68)
- Industry:
  - Information Technology > Security & Privacy (1.00)
- Technology: