Function Naming in Stripped Binaries Using Neural Networks
Artuso, Fiorella; Di Luna, Giuseppe Antonio; Massarelli, Luca; Querzoni, Leonardo
Abstract--In this paper we investigate the problem of automatically naming pieces of assembly code, where by naming we mean assigning to a portion of code the string of words that would likely be assigned by a human reverse engineer. We formally and precisely define the framework in which our investigation takes place. That is, we define the problem, we provide reasonable justifications for the choices that we made during the design of the training and test steps, and we perform a statistical analysis of function names in a large real-world corpus of over 4 million functions. In this framework we test several baselines coming from the field of NLP (e.g., Seq2Seq networks and transformers). Moreover, we provide a set of tailored solutions that beat the aforementioned baselines. The last few years have witnessed a growing trend of applying machine learning (ML) and natural language processing (NLP) techniques to code, as illustrated in [14].
Dec-17-2019
- Country:
- Africa > Middle East > Egypt > Aswan Governorate > Aswan (0.04)
- Genre:
- Research Report (1.00)
- Industry:
- Information Technology > Security & Privacy (0.68)
- Technology: