A Survey on Machine Learning Techniques for Source Code Analysis
Sharma, Tushar, Kechagia, Maria, Georgiou, Stefanos, Tiwari, Rohit, Vats, Indira, Moazen, Hadi, Sarro, Federica
arXiv.org Artificial Intelligence
Advances in machine learning techniques have encouraged researchers to apply these techniques to a myriad of software engineering tasks that use source code analysis, such as testing and vulnerability detection. The resulting large number of studies makes it hard for the community to understand the current research landscape. This paper aims to summarize the current knowledge in applied machine learning for source code analysis. We review studies belonging to twelve categories of software engineering tasks and the corresponding machine learning techniques, tools, and datasets that have been applied to solve them. To do so, we conducted an extensive literature search and identified 479 primary studies published between 2011 and 2021. We summarize our observations and findings with the help of the identified studies. Our findings suggest that the use of machine learning techniques for source code analysis tasks is consistently increasing. We synthesize commonly used steps and the overall workflow for each task and summarize the machine learning techniques employed. We identify a comprehensive list of available datasets and tools usable in this context. Finally, the paper discusses perceived challenges in this area, including the availability of standard datasets, reproducibility and replicability, and hardware resources.
Sep-13-2022
- Country:
- Asia
- Japan > Kyūshū & Okinawa
- Kyūshū > Fukuoka Prefecture > Fukuoka (0.04)
- Singapore > Central Region
- Singapore (0.04)
- Middle East
- Iran > Tehran Province
- Tehran (0.04)
- Republic of Türkiye > Istanbul Province
- Istanbul (0.04)
- Saudi Arabia > Northern Borders Province
- Arar (0.04)
- Russia (0.04)
- China
- South Korea > Seoul
- Seoul (0.04)
- Taiwan > Taiwan
- Taipei (0.04)
- India
- Karnataka > Bengaluru (0.04)
- Maharashtra > Pune (0.04)
- Tamil Nadu > Chennai (0.04)
- Telangana > Hyderabad (0.04)
- Vietnam > Da Nang
- Da Nang (0.04)
- Europe
- Estonia > Harju County
- Tallinn (0.04)
- Portugal > Braga
- Braga (0.04)
- Middle East > Republic of Türkiye
- Istanbul Province > Istanbul (0.04)
- Sweden
- Blekinge County > Karlskrona (0.04)
- Stockholm > Stockholm (0.04)
- Vaestra Goetaland > Gothenburg (0.04)
- Västmanland County > Västerås (0.04)
- Russia > Northwestern Federal District
- Leningrad Oblast > Saint Petersburg (0.04)
- Norway > Central Norway
- Spain > Catalonia
- Barcelona Province > Barcelona (0.04)
- Italy
- Trentino-Alto Adige/Südtirol > Trentino Province
- Trento (0.04)
- Tuscany > Florence (0.04)
- France
- Auvergne-Rhône-Alpes > Lyon
- Lyon (0.04)
- Hauts-de-France > Nord
- Lille (0.04)
- Occitanie > Hérault
- Montpellier (0.04)
- United Kingdom > England
- Greater London > London (0.04)
- Germany
- Hamburg (0.04)
- Saarland > Saarbrücken (0.04)
- Netherlands > North Holland
- Amsterdam (0.04)
- North America
- Canada
- Alberta > Census Division No. 11
- Edmonton Metropolitan Region > Edmonton (0.04)
- Nova Scotia > Halifax Regional Municipality
- Halifax (0.04)
- Ontario
- Kingston (0.04)
- National Capital Region > Ottawa (0.04)
- Toronto (0.04)
- Quebec > Montreal (0.04)
- United States
- New York > New York County
- New York City (0.04)
- California
- San Diego County > San Diego (0.04)
- San Francisco County > San Francisco (0.13)
- Santa Clara County > San Jose (0.04)
- Alaska > Anchorage Municipality
- Anchorage (0.04)
- Florida > Pinellas County
- St. Petersburg (0.04)
- Pennsylvania > Philadelphia County
- Philadelphia (0.04)
- Massachusetts > Suffolk County
- Boston (0.04)
- Arizona > Maricopa County
- Colorado > Denver County
- Denver (0.04)
- New Jersey > Middlesex County
- Piscataway (0.04)
- Texas > Travis County
- Austin (0.04)
- South America
- Argentina > Pampas
- Buenos Aires F.D. > Buenos Aires (0.04)
- Brazil > Sergipe
- Aracaju (0.04)
- Chile > Santiago Metropolitan Region
- Santiago Province > Santiago (0.04)
- Genre:
- Research Report > New Finding (1.00)
- Industry:
- Education > Curriculum
- Subject-Specific Education (1.00)
- Health & Medicine (1.00)
- Information Technology > Security & Privacy (1.00)
- Technology:
- Information Technology > Artificial Intelligence
- Cognitive Science > Problem Solving (0.67)
- Machine Learning
- Decision Tree Learning (0.94)
- Evolutionary Systems (1.00)
- Learning Graphical Models > Directed Networks
- Bayesian Learning (1.00)
- Neural Networks > Deep Learning (1.00)
- Statistical Learning (1.00)
- Natural Language > Text Processing (1.00)
- Representation & Reasoning
- Expert Systems (0.92)
- Search (0.67)
- Uncertainty
- Bayesian Inference (0.67)
- Fuzzy Logic (0.67)