Learning C to x86 Translation: An Experiment in Neural Compilation
Armengol-Estapé, Jordi; O'Boyle, Michael F. P.
arXiv.org Artificial Intelligence
Machine learning based compilation has been explored for over a decade [1]. Early work focused on learning profitability heuristics, while more recently deep learning models have been used to build code-to-code models for translating or decompiling code. However, to the best of our knowledge, there has been no prior work on using machine learning to entirely automate compilation, i.e., given a high-level source code program, generate the equivalent assembler code. In this paper, we investigate whether it is possible to learn an end-to-end machine compiler using neural machine translation. In particular, we focus on the translation of small C functions to x86 assembler. We use an existing function-level C corpus, AnghaBench [2], to build a parallel C-x86 assembler corpus.
Aug-17-2021