TARDiS: Text Augmentation for Refining Diversity and Separability

Kyungmin Kim, SangHun Im, GiBaeg Kim, Heung-Seon Oh

Jan 5, 2025, arXiv.org Artificial Intelligence

Text augmentation (TA) is a critical technique for text classification, especially in few-shot settings. This paper introduces TARDiS, a novel LLM-based TA method that addresses challenges inherent in the generation and alignment stages of two-stage TA methods. For the generation stage, we propose two generation processes, SEG and CEG, which incorporate multiple class-specific prompts to enhance diversity and separability. For the alignment stage, we introduce a class adaptation (CA) method that verifies whether each generated example matches its target class and modifies the example when it does not. Experimental results show that TARDiS outperforms state-of-the-art LLM-based TA methods on various few-shot text classification tasks. An in-depth analysis examines how each stage behaves.
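The two-stage structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' SEG, CEG, or CA procedure: the prompt wording, the function names `generate` and `align`, and the `llm` callable (any function mapping a prompt string to a completion string) are all assumptions made for illustration.

```python
from typing import Callable, Dict, List

# Hypothetical LLM interface: takes a prompt, returns the completion text.
LLM = Callable[[str], str]

# Illustrative prompt templates (assumed wording, not taken from the paper).
GEN_TEMPLATES = [
    "Write a new '{label}' example that paraphrases: {text}",
    "Write a new '{label}' example on a different topic than: {text}",
]
VERIFY_TEMPLATE = (
    "Which of these classes fits the text best: {labels}? "
    "Answer with the class name only.\nText: {text}"
)
MODIFY_TEMPLATE = (
    "Rewrite the text so that it clearly belongs to the class '{label}'.\n"
    "Text: {text}"
)


def generate(llm: LLM, seeds: Dict[str, List[str]],
             templates: List[str]) -> Dict[str, List[str]]:
    """Generation stage: apply every class-specific template to every seed."""
    generated: Dict[str, List[str]] = {label: [] for label in seeds}
    for label, examples in seeds.items():
        for text in examples:
            for template in templates:
                generated[label].append(llm(template.format(label=label, text=text)))
    return generated


def align(llm: LLM, generated: Dict[str, List[str]],
          labels: List[str]) -> Dict[str, List[str]]:
    """Alignment stage: verify each example's class; modify it on a mismatch."""
    aligned: Dict[str, List[str]] = {label: [] for label in generated}
    for label, examples in generated.items():
        for text in examples:
            prompt = VERIFY_TEMPLATE.format(labels=", ".join(labels), text=text)
            predicted = llm(prompt).strip()
            if predicted != label:
                # Verification failed: ask the model to rewrite toward the target class.
                text = llm(MODIFY_TEMPLATE.format(label=label, text=text))
            aligned[label].append(text)
    return aligned
```

Calling `align(llm, generate(llm, seeds, GEN_TEMPLATES), list(seeds))` on a dictionary of few-shot seeds per class returns an augmented set in which every example has either passed verification or been rewritten once. A real system would likely re-verify the rewritten text, which this sketch omits.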


