A study of the impact of generative AI-based data augmentation on software metadata classification

Kumari, Tripti, Charan, Chakali Sai, Das, Ayan

Oct-14-2023–arXiv.org Artificial Intelligence

This paper presents the system submitted by the team from IIT (ISM) Dhanbad to FIRE IRSE 2023 shared task 1, which covers two things: automatically predicting how useful a comment is with respect to its associated source code, and measuring the impact of adding Large Language Model (LLM)-generated data to the original base data. We developed a framework that trains a machine learning model on neural contextual representations of comments and their corresponding code to predict the usefulness of each code-comment pair, and we analyze performance when LLM-generated data is combined with the base data. In the official assessment, our system achieves a 4% increase in F1-score over the baseline, and the quality of the generated data is also assessed.
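As a point of reference for the reported metric, the sketch below computes the F1-score for a binary "useful" / "not useful" labelling and compares a baseline run with an augmented run. It is an illustration only: the label strings, the toy predictions, and the choice of positive-class F1 are assumptions, and the abstract does not say whether the 4% gain is absolute or relative, or which F1 averaging the shared task used.

```python
def f1_score(y_true, y_pred, positive="useful"):
    """Positive-class F1: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are both 0 (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels, invented for illustration; not data from the paper.
gold      = ["useful", "useful", "not useful", "useful", "not useful", "not useful"]
baseline  = ["useful", "not useful", "not useful", "not useful", "useful", "not useful"]
augmented = ["useful", "useful", "not useful", "not useful", "useful", "not useful"]

f1_base = f1_score(gold, baseline)
f1_aug = f1_score(gold, augmented)

# Absolute and relative improvement are different quantities; a reported
# "4% increase" could mean either, so both are computed here.
absolute_gain = f1_aug - f1_base
relative_gain = absolute_gain / f1_base
print(f"baseline F1={f1_base:.3f}, augmented F1={f1_aug:.3f}")
print(f"absolute gain={absolute_gain:.3f}, relative gain={relative_gain:.1%}")
```

The toy numbers are deliberately exaggerated so the difference between absolute and relative gain is visible; they say nothing about the size of the effect in the actual submission.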

generative AI-based data augmentation, software metadata classification



Country:
- Asia > India > Jharkhand > Dhanbad (0.24)

Genre:
- Research Report (0.40)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (0.73)
  - Machine Learning > Neural Networks
    - Deep Learning > Generative AI (0.40)
