Function Naming in Stripped Binaries Using Neural Networks

Artuso, Fiorella; Di Luna, Giuseppe Antonio; Massarelli, Luca; Querzoni, Leonardo

arXiv.org Machine Learning 

Abstract--In this paper we investigate the problem of automatically naming pieces of assembly code, where by naming we mean assigning to a portion of code the string of words that a human reverse engineer would likely assign. We formally and precisely define the framework in which our investigation takes place: we define the problem, we provide reasonable justifications for the choices made in designing the training and test steps, and we perform a statistical analysis of function names in a large real-world corpus of over 4 million functions. In this framework we test several baselines coming from the field of NLP (e.g., Seq2Seq networks and transformers). Moreover, we provide a set of tailored solutions that beat the aforementioned baselines. The last few years have witnessed a growing trend of applying machine learning (ML) and natural language processing (NLP) techniques to code, as illustrated in [14].
