Large Language Models for Code Summarization
Szalontai, Balázs, Szalay, Gergő, Márton, Tamás, Sike, Anna, Pintér, Balázs, Gregorics, Tibor
arXiv.org Artificial Intelligence
The introduction of Encoder-Decoder architectures in natural language processing [26] (both recurrent [6] and Transformer-based [29]) has motivated researchers to apply them to software engineering. One important application is generating summaries of code [25, 2, 11]. A code summarization tool is useful, for example, for understanding legacy code or creating documentation. Since the spread of Large Language Models (LLMs), the working programmer has many more opportunities to use deep learning-based tools. Closed models (such as GPT-4 [21] or Gemini [27]) and open models (such as CodeLlama [24] or WizardCoder [19]) demonstrate impressive capabilities both in generating source code from a task description and in generating natural-language summaries of code. The main objective of this technical report is to investigate how well open-source LLMs handle source code in relation to natural-language text. In particular, we discuss results of some of the most recognized open-source LLMs, focusing on their code summarization/explanation (code-to-text) capabilities. We also discuss the code generation (text-to-code) capabilities of these LLMs, as this is often considered their most defining capability; that is, LLMs are often ranked simply by their results on a code generation benchmark.
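The two tasks contrasted above can be made concrete with a small sketch. The prompt wording below, and the idea of sending such prompts to an instruction-tuned open model (e.g. CodeLlama), are illustrative assumptions, not the report's actual experimental setup:

```python
# Minimal sketch of the two tasks: code summarization (code-to-text)
# and code generation (text-to-code). The prompt templates are
# hypothetical; any instruction-tuned LLM could consume them.

def summarization_prompt(code: str) -> str:
    """Build a code-to-text prompt asking an LLM to summarize `code`."""
    return (
        "Summarize what the following function does in one sentence.\n\n"
        f"```\n{code}\n```"
    )

def generation_prompt(task: str) -> str:
    """Build a text-to-code prompt asking an LLM to implement `task`."""
    return f"Write a Python function that {task}. Return only the code."

snippet = "def add(a, b):\n    return a + b"
print(summarization_prompt(snippet))
print(generation_prompt("reverses a string"))
```

In a code summarization benchmark, the model's reply to the first prompt would be compared against a reference description; in a code generation benchmark, the reply to the second would be executed against unit tests.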
May-29-2024