Few-shot training LLMs for project-specific code-summarization
Ahmed, Toufique; Devanbu, Premkumar
Very large language models (LLMs), such as GPT-3 and Codex, have achieved state-of-the-art performance on several natural-language tasks, and also show great promise for code. A particularly exciting aspect of LLMs is their knack for few-shot and zero-shot learning: they can learn to perform a task with very few examples. Few-shotting has particular synergies in software engineering, where many phenomena (identifier names, APIs, terminology, coding patterns) are known to be highly project-specific. However, project-specific data can be quite limited, especially early in a project's history; thus the few-shot learning capacity of LLMs might be very relevant. In this paper, we investigate the use of few-shot training with the very large GPT (Generative Pre-trained Transformer) Codex model, and find evidence suggesting that one can significantly surpass state-of-the-art models for code-summarization by leveraging project-specific training.
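The technique the abstract describes is few-shot prompting: a handful of (code, summary) pairs drawn from the same project are placed ahead of the function to be summarized, and the model completes the missing summary. The sketch below shows one plausible way to assemble such a prompt in Python; the comment-based format, the function name build_prompt, and the example pairs are illustrative assumptions, not the paper's exact prompt design.

```python
# Sketch of few-shot prompt construction for project-specific
# code summarization. All example pairs are hypothetical.

def build_prompt(examples, target_code):
    """Concatenate k (code, summary) shots from the same project,
    then append the target function for the model to summarize."""
    parts = []
    for code, summary in examples:
        parts.append(f"# Code:\n{code}\n# Summary: {summary}\n")
    parts.append(f"# Code:\n{target_code}\n# Summary:")
    return "\n".join(parts)

# Hypothetical same-project shots: they carry the project's own
# identifiers, APIs, and phrasing, which is what makes few-shotting
# project-specific.
shots = [
    ("def load_config(path):\n    return yaml.safe_load(open(path))",
     "Loads the YAML configuration file at the given path."),
    ("def save_config(cfg, path):\n    yaml.safe_dump(cfg, open(path, 'w'))",
     "Writes the configuration object to a YAML file."),
]

target = ("def reset_config():\n"
          "    global CONFIG\n"
          "    CONFIG = load_config(DEFAULT_PATH)")

prompt = build_prompt(shots, target)
# With the legacy (<1.0) OpenAI SDK, this prompt could be sent to Codex,
# e.g. openai.Completion.create(model="code-davinci-002", prompt=prompt,
#                               max_tokens=40, temperature=0.0, stop="\n")
print(prompt)
```

Because the shots come from the repository under summarization, the model sees the project's naming conventions and terminology in context, which is the plausible mechanism behind the gains the abstract reports.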
arXiv.org Artificial Intelligence
Sep-8-2022