Assessing the Answerability of Queries in Retrieval-Augmented Code Generation

Kim, Geonmin, Kim, Jaeyeon, Park, Hancheol, Shin, Wooksu, Kim, Tae-Ho

Nov-25-2024–arXiv.org Artificial Intelligence

Thanks to the unprecedented language understanding and generation capabilities of large language models (LLMs), Retrieval-augmented Code Generation (RaCG) has recently become widely used among software developers. While this has increased productivity, incorrect code is still frequently produced. In particular, models sometimes generate plausible yet incorrect code for user queries that cannot be answered given the query and the retrieved API descriptions. This study proposes an answerability evaluation task, which assesses whether a valid answer can be generated from a user's query and the retrieved APIs in RaCG. We also build a benchmark dataset, Retrieval-augmented Code Generability Evaluation (RaCGEval), to evaluate models on this task. Experimental results show that the task remains very challenging, with baseline models reaching a performance of only 46.7%. Furthermore, this study discusses methods that could significantly improve performance.

Figure 1: An example of an LLM generating plausible code even when the request falls outside the functionality provided by the library.
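To make the shape of the answerability task concrete, here is a minimal sketch of its input and output: a user query plus a list of retrieved API descriptions goes in, and a yes/no answerability decision comes out. This is not the method evaluated in the paper. The `ApiDoc` type, the word-overlap heuristic, the stopword list, and the `threshold` value are all hypothetical, chosen only to illustrate the interface.

```python
import re
from dataclasses import dataclass

# Hypothetical container for one retrieved API entry (not from the paper).
@dataclass
class ApiDoc:
    name: str
    description: str

# Small illustrative stopword list; a real system would use something richer.
STOPWORDS = {"a", "an", "the", "to", "of", "on", "in", "and", "for", "with"}

def _tokens(text: str) -> set[str]:
    """Lowercase the text and split it into alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def coverage(query: str, apis: list[ApiDoc]) -> float:
    """Fraction of non-stopword query tokens found in the retrieved API docs."""
    query_tokens = _tokens(query) - STOPWORDS
    if not query_tokens:
        return 0.0
    doc_tokens: set[str] = set()
    for api in apis:
        doc_tokens |= _tokens(api.name) | _tokens(api.description)
    return len(query_tokens & doc_tokens) / len(query_tokens)

def is_answerable(query: str, apis: list[ApiDoc], threshold: float = 0.5) -> bool:
    """Toy decision rule: answerable if enough of the query is covered.

    The threshold is an arbitrary assumption for illustration.
    """
    return coverage(query, apis) >= threshold

if __name__ == "__main__":
    retrieved = [ApiDoc("resize_image", "Resize an image to the given width and height")]
    print(is_answerable("resize an image to a given width", retrieved))
    print(is_answerable("train a neural network on audio", retrieved))
```

A lexical heuristic like this would likely miss paraphrases and fail on requests that reuse library vocabulary for unsupported functionality, which is the kind of case Figure 1 describes. The sketch is only meant to show what a model performing the task receives and returns.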

Genre:
- Research Report > New Finding (0.48)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (0.95)
