The Zeno's Paradox of 'Low-Resource' Languages
Nigatu, Hellina Hailu; Tonja, Atnafu Lambebo; Rosman, Benjamin; Solorio, Thamar; Choudhury, Monojit
arXiv.org Artificial Intelligence
The disparity in the languages commonly studied in Natural Language Processing (NLP) is typically reflected by referring to languages as low- vs. high-resourced. However, there is limited consensus on what exactly qualifies as a 'low-resource language.' To understand how NLP papers define and study 'low resource' languages, we qualitatively analyzed 150 papers from the ACL Anthology and popular speech-processing conferences that mention the keyword 'low-resource.' Based on our analysis, we show how several interacting axes contribute to 'low-resourcedness' of a language and why that makes it difficult to track progress for each individual language. We hope our work (1) elicits explicit definitions of the terminology when it is used in papers and (2) provides grounding for the different axes to consider when connoting a language as low-resource.
Oct-28-2024
- Country:
  - Africa
    - East Africa (0.04)
    - Kenya (0.14)
    - Niger (0.04)
    - South Africa (0.04)
  - Asia
    - India (0.04)
    - Indonesia > Bali (0.04)
    - Middle East
      - Jordan (0.04)
      - UAE > Abu Dhabi Emirate
        - Abu Dhabi (0.05)
    - Singapore (0.04)
  - Europe
    - Croatia > Dubrovnik-Neretva County
      - Dubrovnik (0.04)
    - France > Provence-Alpes-Côte d'Azur
      - Bouches-du-Rhône > Marseille (0.06)
    - Ireland > Leinster
      - County Dublin > Dublin (0.05)
    - Italy > Tuscany
      - Florence (0.04)
    - Spain
      - Catalonia > Barcelona Province
        - Barcelona (0.05)
      - Valencian Community > Valencia Province
        - Valencia (0.04)
    - United Kingdom (0.04)
  - North America
    - Canada > Ontario
      - Toronto (0.05)
    - Dominican Republic (0.04)
    - Greenland (0.04)
    - Mexico (0.04)
    - United States
      - New York > New York County
        - New York City (0.04)
      - Virginia (0.04)
      - Washington > King County
        - Seattle (0.04)
  - Oceania > Australia (0.04)
  - South America > Peru (0.04)
- Genre:
  - Research Report (0.82)
- Industry:
  - Education (0.93)
- Technology: