Improving In-context Learning of Multilingual Generative Language Models with Cross-lingual Alignment

Li, Chong, Wang, Shaonan, Zhang, Jiajun, Zong, Chengqing

Jun-12-2024–arXiv.org Artificial Intelligence

Multilingual generative models obtain remarkable cross-lingual in-context learning capabilities through pre-training on large-scale corpora. However, they still exhibit a performance bias toward high-resource languages and learn isolated distributions of multilingual sentence representations, which may hinder knowledge transfer across languages. To bridge this gap, we propose a simple yet effective cross-lingual alignment framework exploiting pairs of translation sentences. It aligns the internal sentence representations across different languages via multilingual contrastive learning and aligns outputs by following cross-lingual instructions in the target language. Experimental results show that even with less than 0.1‰ of pre-training tokens, our alignment framework significantly boosts the cross-lingual abilities of generative language models and mitigates the performance gap. Further analyses reveal that it results in a better internal multilingual representation distribution of multilingual models.
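The contrastive alignment of sentence representations mentioned in the abstract can be illustrated with a generic InfoNCE-style objective. The sketch below is only an assumption about the general shape of such a loss, not the authors' implementation: the function names, the cosine similarity, the temperature of 0.1, and the toy 2-dimensional "embeddings" are all hypothetical. Each source sentence is pulled toward its own translation and pushed away from the other translations in the batch.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def contrastive_alignment_loss(src, tgt, temperature=0.1):
    """InfoNCE-style loss over a batch of translation pairs.

    src[i] and tgt[i] are sentence embeddings of the same sentence in two
    languages (the positive pair); every other tgt[j] in the batch acts as
    a negative. The temperature value is an assumption for illustration.
    """
    total = 0.0
    for i, s in enumerate(src):
        logits = [cosine(s, t) / temperature for t in tgt]
        # Numerically stable log-sum-exp over all candidates in the batch.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(x - m) for x in logits))
        # Negative log-probability of picking the true translation.
        total += log_denom - logits[i]
    return total / len(src)

# Toy embeddings (hypothetical): two sentences, two languages.
aligned_src = [[1.0, 0.0], [0.0, 1.0]]
aligned_tgt = [[0.9, 0.1], [0.1, 0.9]]   # translations lie close to their sources
shuffled_tgt = [[0.1, 0.9], [0.9, 0.1]]  # translations paired with the wrong source

loss_aligned = contrastive_alignment_loss(aligned_src, aligned_tgt)
loss_shuffled = contrastive_alignment_loss(aligned_src, shuffled_tgt)
```

In this toy setup the loss is small when translations sit near their sources in embedding space and large when they do not, which is the property a contrastive alignment objective relies on. How the sentence representations are pooled from the model and how this term is combined with the cross-lingual instruction objective are not specified in the abstract and are left out here.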

computational linguistic, instruction, representation


Country:
- Africa > Niger (0.04)
- North America
  - Dominican Republic (0.04)
  - United States > New York
    - New York County > New York City (0.04)
  - Canada > Ontario
    - Toronto (0.04)
- Europe
  - Belgium (0.04)
  - Ireland > Leinster
    - County Dublin > Dublin (0.04)
- Asia
  - Singapore (0.04)
  - Middle East > UAE
    - Abu Dhabi Emirate > Abu Dhabi (0.04)
  - China
    - Beijing > Beijing (0.04)
    - Hong Kong (0.04)

Genre:
- Research Report > New Finding (0.48)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language
    - Machine Translation (1.00)
    - Large Language Model (1.00)
    - Generation (0.67)
  - Machine Learning > Neural Networks
    - Deep Learning (0.93)
