Uncovering inequalities in new knowledge learning by large language models across different languages

Chenglong Wang, Haoyu Tang, Xiyuan Yang, Yueqi Xie, Jina Suh, Sunayana Sitaram, Junming Huang, Yu Xie, Zhaoya Gong, Xing Xie, Fangzhao Wu

arXiv.org Artificial Intelligence 

Existing research has primarily focused on static analyses that assess disparities in the existing knowledge and capabilities of LLMs across languages. However, LLMs are continuously evolving, acquiring new knowledge to generate up-to-date, domain-specific responses. Investigating linguistic inequalities within this dynamic process is therefore also essential. In this paper, we explore inequalities in how LLMs learn new knowledge across different languages, along four key dimensions: effectiveness, transferability, prioritization, and robustness. Through extensive experiments under two settings (in-context learning and fine-tuning) using both proprietary and open-source models, we demonstrate that low-resource languages consistently face disadvantages across all four dimensions. By shedding light on these disparities, we aim to raise awareness of linguistic inequities in LLMs' new knowledge learning and to foster the development of more inclusive and equitable future LLMs.

The transformation brought about by LLMs is both inevitable and global in scale. One notable example is ChatGPT, which, as of December 2024, served 300 million weekly active users worldwide (6, 7). Given such widespread adoption, it is crucial to study fairness in multilingual environments to ensure that users of different languages can benefit equally from these systems (9).

Existing research on multilingual equality in LLMs primarily focuses on static analyses that evaluate disparities in the knowledge and capabilities of LLMs across different languages (10, 11, 12, 13, 14, 15, 16, 17). Some studies, for example, have examined the amount of factual knowledge encoded in different languages and revealed significant variations; in particular, they show that knowledge available in low-resource languages remains limited because of the scarcity of pre-training data in those languages (18, 19, 20). These studies have significantly advanced our understanding of the extent and nature of multilingual inequalities in LLMs' existing knowledge and capabilities. However, we still lack an understanding of inequalities in the process of acquiring new knowledge, an evolving perspective in LLM research.

Learning new knowledge is crucial for LLMs, as illustrated in Figure 1a. On the one hand, general-purpose LLMs are pre-trained on static datasets collected before training, which may not include real-time or recent information. As a result, these models do not possess knowledge that emerged after their training data were collected, and their knowledge base can quickly become outdated.
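To make the two experimental settings named in the abstract concrete, the sketch below illustrates one way new-knowledge learning could be probed under the in-context learning setting, using a small open-source model from the Hugging Face transformers library. This is a minimal illustration, not the authors' evaluation code: the model name, the invented fact, and its Swahili rendering are placeholders chosen for exposition. The fine-tuning setting would instead update the model's weights on the fact text and then ask the question without the fact in the prompt.

```python
# Minimal sketch (not the paper's code) of probing new-knowledge learning
# under in-context learning, with one hypothetical fact rendered in a
# high-resource (English) and a low-resource (Swahili) language.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; the paper evaluates larger proprietary and open-source LLMs
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Hypothetical "new" fact and question, expressed in both languages.
fact = {
    "en": "The Lumora Bridge opened to the public in June 2024.",
    "sw": "Daraja la Lumora lilifunguliwa kwa umma mnamo Juni 2024.",
}
question = {
    "en": "When did the Lumora Bridge open to the public?",
    "sw": "Daraja la Lumora lilifunguliwa lini kwa umma?",
}

def answer_with_in_context_learning(lang: str, max_new_tokens: int = 20) -> str:
    """Setting 1: prepend the new fact to the prompt; no weight updates."""
    prompt = f"{fact[lang]}\nQ: {question[lang]}\nA:"
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the prompt.
    return tokenizer.decode(
        outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )

# Setting 2 (fine-tuning) would instead take gradient steps on the tokenized
# fact text before querying the model with the question alone.

for lang in ("en", "sw"):
    print(lang, "->", answer_with_in_context_learning(lang))
```

Comparing how reliably the model reproduces the injected fact in each language (e.g., whether "June 2024" appears in the answer) gives a simple instance of the effectiveness dimension; the paper's other dimensions (transferability, prioritization, robustness) would require injecting the fact in one language and probing in another, or perturbing the prompt.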