Winning with Less for Low Resource Languages: Advantage of Cross-Lingual English_Persian Argument Mining Model over LLM Augmentation
Jahan, Ali, Ghayoomi, Masood, Hautli-Janisz, Annette
–arXiv.org Artificial Intelligence
Argument mining is a subfield of natural language processing to identify and extract the argument components, like premises and conclusions, within a text and to recognize the relations between them. It reveals the logical structure of texts to be used in tasks like knowledge extraction. This paper aims at utilizing a cross-lingual approach to argument mining for low-resource languages, by constructing three training scenarios. We examine the models on English, as a high-resource language, and Persian, as a low-resource language. To this end, we evaluate the models based on the English Microtext corpus \citep{PeldszusStede2015}, and its parallel Persian translation. The learning scenarios are as follow: (i) zero-shot transfer, where the model is trained solely with the English data, (ii) English-only training enhanced by synthetic examples generated by Large Language Models (LLMs), and (iii) a cross-lingual model that combines the original English data with manually translated Persian sentences. The zero-shot transfer model attains F1 scores of 50.2\% on the English test set and 50.7\% on the Persian test set. LLM-based augmentation model improves the performance up to 59.2\% on English and 69.3\% on Persian. The cross-lingual model, trained on both languages but evaluated solely on the Persian test set, surpasses the LLM-based variant, by achieving a F1 of 74.8\%. Results indicate that a lightweight cross-lingual blend can outperform considerably the more resource-intensive augmentation pipelines, and it offers a practical pathway for the argument mining task to overcome data resource shortage on low-resource languages.
arXiv.org Artificial Intelligence
Nov-27-2025
- Country:
- Asia
- Middle East > Iran
- Tehran Province > Tehran (0.04)
- Singapore (0.04)
- Thailand > Bangkok
- Bangkok (0.04)
- Middle East > Iran
- Europe
- Denmark > Capital Region
- Copenhagen (0.04)
- France > Provence-Alpes-Côte d'Azur
- Bouches-du-Rhône > Marseille (0.04)
- Germany > Berlin (0.04)
- Ireland > Leinster
- County Dublin > Dublin (0.04)
- Italy > Tuscany
- Florence (0.04)
- Portugal > Lisbon
- Lisbon (0.04)
- Spain
- United Kingdom > England
- Cambridgeshire > Cambridge (0.04)
- Oxfordshire > Oxford (0.04)
- Denmark > Capital Region
- North America
- Canada
- Alberta > Census Division No. 15
- Improvement District No. 9 > Banff (0.04)
- British Columbia > Metro Vancouver Regional District
- Vancouver (0.04)
- Ontario > Toronto (0.04)
- Alberta > Census Division No. 15
- United States
- Colorado > Denver County
- Denver (0.04)
- Louisiana > Orleans Parish
- New Orleans (0.04)
- New Jersey > Bergen County
- Mahwah (0.04)
- New Mexico > Santa Fe County
- Santa Fe (0.04)
- New York > New York County
- New York City (0.04)
- Colorado > Denver County
- Canada
- Oceania > Palau (0.04)
- South America > Chile
- Asia
- Genre:
- Research Report
- Experimental Study (0.67)
- New Finding (0.68)
- Research Report
- Industry:
- Education (0.34)
- Technology: