Semantic Pivots Enable Cross-Lingual Transfer in Large Language Models
Kaiyu He, Tong Zhou, Yubo Chen, Delai Qiu, Shengping Liu, Kang Liu, Jun Zhao
arXiv.org Artificial Intelligence
Large language models (LLMs) demonstrate remarkable ability in cross-lingual tasks. Understanding how LLMs acquire this ability is crucial for their interpretability. To quantify the cross-lingual ability of LLMs accurately, we propose a Word-Level Cross-Lingual Translation Task. To understand how LLMs learn cross-lingual ability, we trace the outputs of their intermediate layers on the word translation task. We identify two distinct behaviors in the forward pass of LLMs: co-occurrence behavior and semantic pivot behavior. We attribute these two behaviors to the co-occurrence frequency of words and identify semantic pivots in the pre-training dataset. Finally, to apply our findings to improve the cross-lingual ability of LLMs, we reconstruct a semantic pivot-aware pre-training dataset from documents with a high proportion of semantic pivots. Our experiments validate the effectiveness of this approach in enhancing cross-lingual ability. Our research contributes insights into the interpretability of LLMs and offers a method for improving their cross-lingual ability.
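The two data-side ideas in the abstract, word co-occurrence frequency and selecting documents with a high proportion of semantic pivots, can be illustrated with a minimal sketch. This is not the authors' pipeline: the abstract does not say how pivots are detected or which threshold is used, so the pivot set and threshold below are taken as given inputs, whitespace tokenization is a simplification, and all function names (`cooccurrence_counts`, `pivot_proportion`, `select_documents`) are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(docs):
    """Count, for each unordered word pair, how many documents contain both words.

    Assumption: document-level co-occurrence with whitespace tokenization.
    """
    counts = Counter()
    for doc in docs:
        vocab = sorted(set(doc.lower().split()))
        # Each pair is stored in sorted order so (a, b) and (b, a) coincide.
        counts.update(combinations(vocab, 2))
    return counts


def pivot_proportion(doc, pivots):
    """Fraction of tokens in `doc` that belong to the given pivot set."""
    tokens = doc.lower().split()
    if not tokens:
        return 0.0
    return sum(token in pivots for token in tokens) / len(tokens)


def select_documents(docs, pivots, threshold):
    """Keep documents whose pivot proportion is at least `threshold`.

    Both `pivots` and `threshold` are hypothetical inputs; the abstract
    does not specify how either is chosen.
    """
    return [doc for doc in docs if pivot_proportion(doc, pivots) >= threshold]
```

Under these assumptions, calling `select_documents(corpus, pivots, threshold)` on a list of strings would yield a filtered corpus, while `cooccurrence_counts` gives the pair frequencies one could compare against model behavior.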
May-23-2025