Does Synthetic Data Help Named Entity Recognition for Low-Resource Languages?

Nov-6-2025–arXiv.org Artificial Intelligence

Named Entity Recognition (NER) for low-resource languages aims to produce robust systems for languages with limited labeled training data, and it has been an area of increasing interest within NLP. Data augmentation is a common way to increase the amount of labeled data in low-resource settings. In this paper, we explore the role of synthetic data in multilingual, low-resource NER, considering 11 languages from diverse language families. Our results suggest that synthetic data does in fact hold promise for low-resource language NER, though we see significant variation between languages.

computational linguistics, large language model, machine learning, (19 more...)

Country:
- Europe (1.00)
- North America
  - Canada (0.70)
  - Mexico > Mexico City (0.14)
- Asia > Middle East
  - UAE (0.15)

Genre:
- Research Report > New Finding (1.00)

Industry:
- Government (0.46)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language
    - Text Processing (1.00)
    - Large Language Model (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (0.33)
