MiLiC-Eval: Benchmarking Multilingual LLMs for China's Minority Languages
Chen Zhang, Mingxu Tao, Zhiyuan Liao, Yansong Feng
arXiv.org Artificial Intelligence
Large language models (LLMs) excel in high-resource languages but struggle with low-resource languages (LRLs), particularly those spoken by minority communities in China, such as Tibetan, Uyghur, Kazakh, and Mongolian. To systematically track the progress in these languages, we introduce MiLiC-Eval, a benchmark designed for minority languages in China, featuring 24K instances across 9 tasks. MiLiC-Eval focuses on underrepresented writing systems and provides a fine-grained assessment of linguistic and problem-solving skills. Our evaluation reveals that LLMs perform poorly on syntax-intensive tasks and multi-script languages. We further demonstrate how MiLiC-Eval can help advance LRL research in handling diverse writing systems and understanding the process of language adaptation.
March 2, 2025