VECO 2.0: Cross-lingual Language Model Pre-training with Multi-granularity Contrastive Learning
Zhang, Zhen-Ru, Tan, Chuanqi, Huang, Songfang, Huang, Fei
Recent studies have demonstrated the potential of cross-lingual transfer by training a unified Transformer encoder for multiple languages. Beyond the masked language modeling objective, existing cross-lingual pre-training works leverage sentence-level contrastive learning or plug in an extra cross-attention module to compensate for insufficient cross-lingual alignment. Nonetheless, synonym pairs residing in bilingual corpora are left unexploited and unaligned, even though such token-level alignment is more crucial for token-level tasks than establishing sentence-level interdependence. In this work, we propose VECO 2.0, a cross-lingual pre-trained model based on contrastive learning with multi-granularity alignments. Specifically, a sequence-to-sequence alignment objective maximizes the similarity of parallel sentence pairs while minimizing that of non-parallel pairs. A token-to-token alignment objective is then integrated to pull together synonymous tokens, mined via a thesaurus dictionary, and to separate them from the other, unpaired tokens in a bilingual instance. Experiments show the effectiveness of the proposed strategy for cross-lingual model pre-training on the XTREME benchmark.
arXiv.org Artificial Intelligence
Apr-17-2023
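
The abstract describes two contrastive terms at different granularities. The sketch below is a minimal PyTorch rendering of that idea, assuming in-batch negatives for the sentence-level term and treating each mined synonym pair as a positive against all other tokens in the same bilingual instance; the function names, pooling, temperature, and exact negative set are illustrative assumptions, not the paper's formulation.

```python
import torch
import torch.nn.functional as F

def seq_contrastive_loss(src_emb, tgt_emb, temperature=0.05):
    """Sequence-to-sequence alignment (sketch): for a batch of parallel
    sentence pairs, (src_i, tgt_i) is the positive and every other
    (src_i, tgt_j) in the batch is a negative, InfoNCE-style.

    src_emb, tgt_emb: (batch, dim) pooled sentence representations.
    """
    src = F.normalize(src_emb, dim=-1)
    tgt = F.normalize(tgt_emb, dim=-1)
    logits = src @ tgt.t() / temperature              # (batch, batch) similarities
    labels = torch.arange(src.size(0), device=src.device)
    # Symmetric objective over both retrieval directions.
    return 0.5 * (F.cross_entropy(logits, labels) +
                  F.cross_entropy(logits.t(), labels))

def tok_contrastive_loss(tok_emb, pos_pairs, temperature=0.05):
    """Token-to-token alignment (sketch): pull together token pairs marked
    as synonyms (e.g. mined from a bilingual dictionary) and contrast each
    anchor against every other token in the same bilingual instance.

    tok_emb:   (seq_len, dim) token representations of one concatenated
               bilingual instance.
    pos_pairs: list of (i, j) index pairs of synonymous tokens.
    """
    tok = F.normalize(tok_emb, dim=-1)
    sims = tok @ tok.t() / temperature                # (seq_len, seq_len)
    sims.fill_diagonal_(float('-inf'))                # an anchor is not its own positive
    loss = 0.0
    for i, j in pos_pairs:
        # For anchor i, synonym j is the positive; all remaining tokens
        # in the instance serve as negatives.
        loss = loss + F.cross_entropy(sims[i].unsqueeze(0),
                                      torch.tensor([j], device=tok.device))
    return loss / max(len(pos_pairs), 1)
```

In a full pre-training run, terms like these would be added to the masked language modeling loss, e.g. loss = mlm_loss + seq_contrastive_loss(src, tgt) + tok_contrastive_loss(toks, pairs).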