An Extractive-and-Abstractive Framework for Source Code Summarization
Sun, Weisong, Fang, Chunrong, Chen, Yuchen, Zhang, Quanjun, Tao, Guanhong, Han, Tingxu, Ge, Yifei, You, Yudu, Luo, Bin
arXiv.org Artificial Intelligence
(Source) Code summarization aims to automatically generate natural-language summaries (comments) for a given code snippet. Such summaries play a key role in helping developers understand and maintain source code. Existing code summarization techniques can be categorized into extractive methods and abstractive methods. Extractive methods use retrieval techniques to extract a subset of important statements and keywords from the code snippet, and generate a summary that preserves the factual details in those statements and keywords. However, such a subset may miss identifier or entity naming, and consequently the naturalness of the generated summary is usually poor. Abstractive methods leverage encoder-decoder models from the neural machine translation domain to generate human-written-like summaries. The generated summaries, however, often miss important factual details. To generate human-written-like summaries that preserve factual details, we propose a novel extractive-and-abstractive framework. The extractive module performs extractive code summarization: it takes in the code snippet and predicts the important statements that contain key factual details. The abstractive module performs abstractive code summarization: it takes in the entire code snippet and the important statements in parallel and generates a succinct, human-written-like natural language summary. We evaluate the effectiveness of our technique, called EACS, through extensive experiments on three datasets involving six programming languages. Experimental results show that EACS significantly outperforms state-of-the-art techniques on all three widely used metrics: BLEU, METEOR, and ROUGE-L.
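Of the three metrics named in the abstract, ROUGE-L is the easiest to illustrate: it scores a generated summary by the longest common subsequence (LCS) of tokens it shares with a reference summary. The following is a minimal sketch, not the paper's evaluation script. It assumes lowercased whitespace tokenization and the balanced F1 form (equal weight on precision and recall); published evaluations may use a different tokenizer or recall weighting. The function names `lcs_length` and `rouge_l_f1` and the example summaries are illustrative only.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of token lists a and b."""
    prev = [0] * (len(b) + 1)  # LCS lengths for the previous row of the DP table
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate, reference):
    """ROUGE-L F1 over whitespace tokens (assumed tokenization and weighting)."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)  # share of candidate tokens in the LCS
    recall = lcs / len(ref)      # share of reference tokens in the LCS
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical summaries, made up for illustration.
    reference = "returns the index of the first matching element"
    candidate = "returns index of the first element"
    print(round(rouge_l_f1(candidate, reference), 3))  # prints 0.857
```

In this example every candidate token appears in order in the reference, so precision is 6/6 and recall is 6/8, giving an F1 of 6/7. A summary that drops factual details loses recall under this measure, while one that adds tokens absent from the reference loses precision.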
Nov-4-2023
- Country:
- Africa > South Africa
- Western Cape > Cape Town (0.04)
- Asia
- China
- Beijing > Beijing (0.04)
- Hong Kong (0.04)
- Jiangsu Province > Nanjing (0.05)
- India > Telangana
- Hyderabad (0.04)
- Middle East > Qatar
- South Korea > Seoul
- Seoul (0.04)
- Taiwan > Taiwan Province
- Taipei (0.04)
- Europe
- Hungary (0.04)
- Czechia (0.04)
- France > Occitanie
- Hérault > Montpellier (0.04)
- Sweden
- Stockholm > Stockholm (0.04)
- Vaestra Goetaland > Gothenburg (0.04)
- Belgium > Flanders
- Antwerp Province > Antwerp (0.04)
- Portugal > Lisbon
- Lisbon (0.04)
- Greece > Attica
- Athens (0.04)
- United Kingdom > England
- Greater London > London (0.04)
- West Midlands > Coventry (0.04)
- Spain
- Catalonia > Barcelona Province
- Barcelona (0.04)
- Galicia > Madrid (0.04)
- Germany > Berlin (0.04)
- North America
- Canada
- British Columbia > Metro Vancouver Regional District
- Vancouver (0.14)
- Ontario > Toronto (0.04)
- Quebec > Montreal (0.04)
- Dominican Republic (0.04)
- United States
- California
- Los Angeles County > Long Beach (0.04)
- San Diego County > San Diego (0.04)
- San Francisco County > San Francisco (0.28)
- Pennsylvania
- Allegheny County > Pittsburgh (0.04)
- Philadelphia County > Philadelphia (0.04)
- Oregon > Benton County
- Corvallis (0.04)
- Louisiana > Orleans Parish
- New Orleans (0.04)
- Hawaii > Honolulu County
- Honolulu (0.04)
- Maryland > Montgomery County
- Gaithersburg (0.04)
- Indiana > Tippecanoe County
- Lafayette (0.04)
- West Lafayette (0.04)
- Colorado > Denver County
- Denver (0.04)
- Minnesota > Hennepin County
- Minneapolis (0.14)
- Michigan > Washtenaw County
- Ann Arbor (0.04)
- Massachusetts > Essex County
- Beverly (0.04)
- Oceania > Australia
- Genre:
- Research Report > New Finding (0.67)
- Industry:
- Information Technology (0.45)