Automatic Detection of LLM-generated Code: A Case Study of Claude 3 Haiku
Musfiqur Rahman, SayedHassan Khatoonabadi, Ahmad Abdellatif, Emad Shihab
The use of Large Language Models (LLMs) to generate source code has gained popularity among software developers. However, LLM-generated code can introduce suboptimal, defective, and vulnerable code, which makes it necessary to devise methods for accurately detecting it. Toward this goal, we perform a case study of Claude 3 Haiku (or Claude 3 for brevity) on the CodeSearchNet dataset. We divide our analyses into two parts: function-level and class-level. We extract 22 software metric features, such as Code Lines and Cyclomatic Complexity, for each level of granularity. We then analyze code snippets generated by Claude 3 and their human-authored counterparts using the extracted features to understand how unique the code generated by Claude 3 is. Next, we use the unique characteristics of Claude 3-generated code to build Machine Learning (ML) models and identify which features of the code snippets make them more detectable by ML models. Our results indicate that Claude 3 tends to generate longer functions but shorter classes than humans, and this characteristic can be used to detect Claude 3-generated code with ML models, achieving accuracies of 82% for function-level and 66% for class-level snippets.
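The abstract does not enumerate the 22 metric features or the specific ML models used, so the sketch below is only a hypothetical illustration of the general approach: compute a small set of code metrics (line count and a rough cyclomatic-complexity proxy stand in for the paper's feature set) and train a standard classifier to separate LLM-generated from human-authored snippets. All function names and parameter choices here are assumptions, not the authors' implementation.

```python
# Hypothetical sketch of metric-based detection of LLM-generated code.
# The real study uses 22 software metrics; we approximate with two.
import re
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Rough proxy for cyclomatic complexity: count branching keywords.
BRANCH_KEYWORDS = re.compile(r"\b(if|elif|for|while|and|or|except|case)\b")

def extract_features(snippet: str) -> list[float]:
    """Return [code_lines, approx_cyclomatic_complexity] for one snippet."""
    code_lines = sum(1 for ln in snippet.splitlines() if ln.strip())
    complexity = 1 + len(BRANCH_KEYWORDS.findall(snippet))
    return [float(code_lines), float(complexity)]

def train_detector(snippets: list[str], labels: list[int]) -> RandomForestClassifier:
    """labels: 1 = LLM-generated, 0 = human-authored (hypothetical setup)."""
    X = [extract_features(s) for s in snippets]
    X_train, X_test, y_train, y_test = train_test_split(
        X, labels, test_size=0.2, random_state=42, stratify=labels
    )
    model = RandomForestClassifier(n_estimators=200, random_state=42)
    model.fit(X_train, y_train)
    print("held-out accuracy:", accuracy_score(y_test, model.predict(X_test)))
    return model
```

A random forest is used here only as a familiar baseline; the key idea from the paper is that length-related features (longer functions, shorter classes) carry much of the signal that makes Claude 3-generated code detectable.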
arXiv.org Artificial Intelligence
Sep-2-2024
- Country:
- North America > Canada (0.47)
- Genre:
- Research Report > Experimental Study (0.93)
- Research Report > New Finding (1.00)
- Industry:
- Information Technology > Software (0.34)
- Law (0.46)