LLMs Are Few-Shot In-Context Low-Resource Language Learners
Cahyawijaya, Samuel; Lovenia, Holy; Fung, Pascale
arXiv.org Artificial Intelligence
In-context learning (ICL) empowers large language models (LLMs) to perform diverse tasks in underrepresented languages using only short in-context information, offering a crucial avenue for narrowing the gap between high-resource and low-resource languages. Nonetheless, only a handful of works have explored ICL for low-resource languages, and most of them focus on relatively high-resource languages such as French and Spanish. In this work, we extensively study ICL and its cross-lingual variation (X-ICL) on 25 low-resource and 7 relatively higher-resource languages. Our study not only assesses the effectiveness of ICL with LLMs in low-resource languages but also identifies the shortcomings of in-context label alignment and introduces a more effective alternative: query alignment. Moreover, we provide valuable insights into various facets of ICL for low-resource languages. Our study concludes that few-shot in-context information is significant for enhancing the low-resource language understanding quality of LLMs: semantically relevant information closes the language gap in the target language and aligns the semantics between the targeted low-resource language and the high-resource language that the model is proficient in. Our work highlights the importance of advancing ICL research, particularly for low-resource languages. Our code is publicly released at https://github.com/SamuelCahyawijaya/in-context-alignment
Jun-25-2024
- Country:
  - South America (0.05)
  - North America
    - Dominican Republic (0.04)
    - United States
      - Washington > King County
        - Seattle (0.04)
      - Minnesota > Hennepin County
        - Minneapolis (0.04)
      - California > San Francisco County
        - San Francisco (0.14)
      - Washington > King County
    - Canada > Ontario
      - Toronto (0.04)
  - Europe
    - Belgium (0.04)
    - Sweden > Uppsala County
      - Uppsala (0.04)
    - Italy
      - Tuscany > Florence (0.04)
      - Calabria > Catanzaro Province
        - Catanzaro (0.04)
    - Ireland > Leinster
      - County Dublin > Dublin (0.04)
    - France > Provence-Alpes-Côte d'Azur
      - Bouches-du-Rhône > Marseille (0.04)
    - Albania > Tirana County
      - Tirana (0.04)
  - Asia
    - East Asia (0.04)
    - Singapore (0.04)
    - Indonesia (0.04)
    - China > Hong Kong (0.04)
    - Central Asia (0.04)
    - Southeast Asia (0.04)
    - Middle East
      - Israel (0.04)
      - Jordan (0.04)
      - UAE > Abu Dhabi Emirate
        - Abu Dhabi (0.04)
    - Japan > Honshū
      - Chūbu > Toyama Prefecture > Toyama (0.04)
  - Africa
    - Niger (0.04)
    - North Africa (0.04)
- Genre:
  - Research Report > New Finding (0.93)
- Technology: