Statistical Machine Translation Is a Natural Fit for Automatic Identifier Renaming in Software Source Code

Lacomis, Jeremy (Carnegie Mellon University) | Jaffe, Alan (Carnegie Mellon University) | Schwartz, Edward J. (Carnegie Mellon University) | Le Goues, Claire (Carnegie Mellon University) | Vasilescu, Bogdan (Carnegie Mellon University)

Apr-6-2018–AAAI Conferences

Advances in natural language processing have led to a variety of successful tools and techniques for problems such as understanding, generating, and translating natural languages. Given this success, a natural question is whether the same techniques can also be applied to programming languages. So far, the results have been mixed. Researchers who used statistical machine translation (SMT) to translate between programming languages found that a large percentage of the translated programs were not syntactically valid. On the other hand, SMT has been successfully employed to recover identifiers in obfuscated JavaScript code. In this paper, we discuss several differences between natural languages and programming languages that can thwart the successful application of NLP techniques to program transformation. We also discuss several strategies for coping with these differences in practice, drawing on our own experience using SMT to assign meaningful identifier names to variables in decompiled C programs.

artificial intelligence, natural language, statistical machine translation

Technology:
- Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.60)
