CodeCMR: Cross-Modal Retrieval For Function-Level Binary Source Code Matching
Binary source code matching, especially at the function level, plays a critical role in computer security. Given only binary code, finding the corresponding source code improves the accuracy and efficiency of reverse engineering. Given only source code, retrieving the related binary code helps confirm known vulnerabilities. However, because source and binary code differ so greatly, few studies have investigated binary source code matching. Previously published studies extract code literals such as strings and integers, then apply traditional matching algorithms such as the Hungarian algorithm to match the code.
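The literal-based approach described above can be illustrated with a minimal sketch. It is not the method of any cited study: the Jaccard similarity, the function names (`parse_header`, `sub_401000`, and so on), and the literal sets are all hypothetical. For brevity, an exhaustive search over permutations stands in for the Hungarian algorithm, which finds the same optimal one-to-one assignment in polynomial time.

```python
from itertools import permutations


def jaccard(a, b):
    """Jaccard similarity of two literal sets (1.0 when both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def match_functions(source_literals, binary_literals):
    """Pair each source function with one binary function so that the
    total literal similarity is maximal.

    Assumes both sides hold the same number of functions. Brute force
    is used here only as a stand-in for the Hungarian algorithm.
    """
    src_names = list(source_literals)
    bin_names = list(binary_literals)
    best_score, best_pairs = -1.0, []
    for perm in permutations(bin_names):
        score = sum(
            jaccard(source_literals[s], binary_literals[b])
            for s, b in zip(src_names, perm)
        )
        if score > best_score:
            best_score, best_pairs = score, list(zip(src_names, perm))
    return dict(best_pairs)


# Hypothetical literals (strings and integers) extracted from each function.
source_literals = {
    "parse_header": {"HTTP/1.1", 200, 404},
    "checksum": {0xFFFF, 16},
    "log_error": {"error: %s\n", 2},
}
binary_literals = {
    "sub_401000": {0xFFFF, 16, 8},
    "sub_401080": {"error: %s\n", 2},
    "sub_401100": {"HTTP/1.1", 200},
}

matches = match_functions(source_literals, binary_literals)
for src, binary in matches.items():
    print(src, "->", binary)
```

Functions that share few or no literals give this kind of matching little to work with, which is one limitation of relying on literals alone.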
Our motivation is to solve the binary source code matching problem, which is very important for 2
We thank the reviewers for their valuable feedback and address their comments and concerns below. Binary source code matching can be leveraged for code vulnerability analysis and malware detection. Our research can benefit tens of thousands of reverse engineering researchers. In our paper, the modeling part of the technique is not deep.