Enhancing Semi-Supervised Learning for Extractive Summarization with an LLM-based pseudolabeler
Sahu, Gaurav, Vechtomova, Olga, Laradji, Issam H.
arXiv.org Artificial Intelligence
This work tackles extractive text summarization in a limited labeled data scenario using a semi-supervised approach. Specifically, we propose a prompt-based pseudolabel selection strategy using GPT-4. We evaluate our method on three text summarization datasets: TweetSumm, WikiHow, and ArXiv/PubMed. Our experiments show that using an LLM to evaluate and generate pseudolabels improves ROUGE-1 by 10-20% across the datasets, which is akin to enhancing pretrained models. We also show that such a method needs a smaller pool of unlabeled examples to perform better.
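The pseudolabel selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the prompt, the scoring scale, or the selection rule, so everything below is an assumption: `select_pseudolabels` and `toy_score` are hypothetical names, the top-k rule is one plausible selection policy, and `toy_score` is a word-overlap stand-in for the rating that a GPT-4 prompt would return in the actual method.

```python
from typing import Callable, List, Tuple

# A candidate pairs an unlabeled document with a model-proposed extractive
# summary (a list of sentences taken from that document).
Candidate = Tuple[str, List[str]]


def select_pseudolabels(
    candidates: List[Candidate],
    score_fn: Callable[[str, List[str]], float],
    top_k: int,
) -> List[Candidate]:
    """Keep the top_k candidates ranked by score_fn (highest first).

    score_fn is a placeholder for an LLM-based rating of summary quality;
    the real prompt and scale are not specified in the abstract.
    """
    scored = [(score_fn(doc, summ), doc, summ) for doc, summ in candidates]
    # Sort on the score only, so ties keep their original order.
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(doc, summ) for _, doc, summ in scored[:top_k]]


def toy_score(document: str, summary: List[str]) -> float:
    """Hypothetical stand-in scorer: fraction of document words covered.

    This is NOT the paper's scorer; it only makes the sketch runnable
    without calling an external LLM API.
    """
    doc_words = set(document.lower().split())
    summ_words = set(" ".join(summary).lower().split())
    if not doc_words:
        return 0.0
    return len(doc_words & summ_words) / len(doc_words)


candidates = [
    ("a b c d", ["a b"]),
    ("a b c d", ["a b c d"]),
    ("x y", ["z"]),
]
selected = select_pseudolabels(candidates, toy_score, top_k=2)
```

In a semi-supervised loop, the selected pairs would be added to the labeled training set before the summarizer is retrained; swapping `toy_score` for a function that queries an LLM is the only change the sketch would need.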
Nov-15-2023
- Country:
- Asia > China (0.14)
- Europe
- North America > United States
- Louisiana (0.14)
- Genre:
- Research Report (0.40)
- Technology: