Zero-shot Cross-lingual Transfer without Parallel Corpus

Zhang, Yuyang, Han, Xiaofeng, Wang, Baojun

Oct-7-2023–arXiv.org Artificial Intelligence

Although pre-trained language models have achieved great success on multilingual NLP (Natural Language Processing) tasks, the lack of training data for many tasks in low-resource languages still limits their performance. One effective way to address this problem is to transfer knowledge from resource-rich languages to low-resource languages. However, many previous works on cross-lingual transfer rely heavily on parallel corpora or translation models, which are often difficult to obtain. We propose a novel approach to zero-shot cross-lingual transfer with a pre-trained model. It consists of two modules: a Bilingual Task Fitting module, which applies task-related bilingual information alignment, and a self-training module, which generates pseudo soft and hard labels for unlabeled data and uses them for self-training. We achieve new state-of-the-art (SOTA) results on different tasks without any dependency on parallel corpora or translation models.
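The abstract does not spell out how the pseudo labels are produced, so the following is only a minimal sketch of one common way to derive soft and hard pseudo labels from a classifier's outputs on unlabeled target-language data. It assumes the soft label is the softmax distribution and the hard label is the argmax class, kept only above a confidence threshold. The function names, the `threshold` parameter, and its default value are hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Convert raw scores into a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pseudo_labels(batch_logits, threshold=0.9):
    """Return one (soft, hard) pair per unlabeled example.

    soft: the full predicted distribution (assumed soft pseudo label).
    hard: the argmax class index, or None when the model's confidence is
          below `threshold` (a hypothetical filter, not from the paper).
    """
    results = []
    for logits in batch_logits:
        soft = softmax(logits)
        best = max(range(len(soft)), key=soft.__getitem__)
        hard = best if soft[best] >= threshold else None
        results.append((soft, hard))
    return results


def soft_cross_entropy(pred_probs, soft_target):
    """Cross-entropy of a prediction against a soft target distribution."""
    eps = 1e-12  # guards against log(0) if a probability underflows
    return -sum(t * math.log(max(p, eps))
                for t, p in zip(soft_target, pred_probs) if t > 0)
```

In a self-training loop of this kind, examples with a hard label would typically be trained with ordinary cross-entropy, while `soft_cross_entropy` could be used against the soft labels; how the paper actually combines the two is not stated in the abstract.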

cross-lingual transfer, dataset, target language

Country:
- North America > Canada (0.04)
- Europe
  - Germany (0.04)
  - France (0.04)
- Asia
  - China (0.04)
  - Taiwan > Taiwan Province
    - Taipei (0.04)

Genre:
- Instructional Material (0.55)
- Research Report > Promising Solution (0.34)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning (1.00)
  - Natural Language
    - Large Language Model (0.72)
    - Information Retrieval (0.69)
