Connecting Ideas in 'Lower-Resource' Scenarios: NLP for National Varieties, Creoles and Other Low-resource Scenarios

Aditya Joshi, Diptesh Kanojia, Heather Lent, Hour Kaing, Haiyue Song

arXiv.org Artificial Intelligence 

Selected as a tutorial at COLING 2025.

Despite excellent results on benchmarks over a small subset of languages, large language models struggle to process text from languages situated in 'lower-resource' scenarios such as dialects/sociolects (national or social varieties of a language), Creoles (languages arising from linguistic contact between multiple languages) and other low-resource languages. This introductory tutorial will identify common challenges, approaches, and themes in natural language processing (NLP) research for confronting and overcoming the obstacles inherent to data-poor contexts.

While each of the lower-resource scenarios bears its unique socio-historical contexts, the tutorial brings together researchers working separately in these scenarios. Collectively, the tutorial will connect past research in terms of: challenges in data curation; potential for wide linguistic variation (e.g., existing on a linguistic continuum or eschewing strict spelling conventions, etc.); the need for smart modeling choices over greedy ones; and increased model vulnerability. This introductory tutorial identifies the emergence of 'lower-resource' scenarios, specifically national varieties, Creoles and other low-resource languages, and highlights commonalities and differences.
