A study of the impact of generative AI-based data augmentation on software metadata classification
Kumari, Tripti; Charan, Chakali Sai; Das, Ayan
arXiv.org Artificial Intelligence
This paper presents the system submitted by the team from IIT(ISM) Dhanbad to FIRE IRSE 2023 shared task 1 on automatically predicting the usefulness of code-comment pairs with respect to the associated source code, and examines the impact of adding Large Language Model (LLM) generated data to the original base data. We developed a framework in which a machine learning-based model is trained on neural contextual representations of the comments and their corresponding code to predict the usefulness of each code-comment pair, and we analyze its performance when the LLM-generated data is combined with the base data. In the official assessment, our system achieves a 4% increase in F1-score over the baseline; the assessment also covers the quality of the generated data.
Oct-14-2023