Enhancing Low-Resource NMT with a Multilingual Encoder and Knowledge Distillation: A Case Study

Roy, Aniruddha, Ray, Pretam, Maheshwari, Ayush, Sarkar, Sudeshna, Goyal, Pawan

Jul-9-2024, arXiv.org Artificial Intelligence

Neural Machine Translation (NMT) remains a formidable challenge, especially for low-resource languages. Pre-trained multilingual sequence-to-sequence (seq2seq) models, such as mBART-50, have demonstrated impressive performance on various low-resource NMT tasks. However, their pre-training has been confined to 50 languages, leaving numerous low-resource languages unsupported, particularly those spoken in the Indian subcontinent. Expanding mBART-50's language support requires complex pre-training and risks a decline in performance due to catastrophic forgetting. Given these challenges, this paper explores a framework that combines a pre-trained language model with knowledge distillation in a seq2seq architecture to facilitate translation for low-resource languages, including those not covered by mBART-50. The proposed framework employs a multilingual encoder-based seq2seq model as the foundational architecture and subsequently uses complementary knowledge distillation techniques to mitigate the impact of imbalanced training. Our framework is evaluated on three low-resource Indic languages in four Indic-to-Indic directions, yielding significant BLEU-4 and chrF improvements over baselines. Further, we conduct human evaluation to confirm the effectiveness of our approach. Our code is publicly available at https://github.com/raypretam/Two-step-low-res-NMT.
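
For readers unfamiliar with knowledge distillation, the following is a minimal sketch of a generic word-level distillation loss for a single target token. It is not taken from the paper or its repository, and it does not reproduce the complementary distillation scheme the abstract mentions. The function names, the interpolation weight `alpha`, and the `temperature` parameter are illustrative assumptions.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert raw scores to probabilities, with optional temperature smoothing."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, gold_index,
                      alpha=0.5, temperature=1.0):
    """Hypothetical per-token loss:
    alpha * KL(teacher || student) + (1 - alpha) * cross-entropy on the gold token.
    """
    # Standard cross-entropy of the student against the reference token.
    student_probs = softmax(student_logits)
    cross_entropy = -math.log(student_probs[gold_index])

    # KL divergence between the softened teacher and student distributions.
    teacher_soft = softmax(teacher_logits, temperature)
    student_soft = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s)
             for t, s in zip(teacher_soft, student_soft) if t > 0)

    return alpha * kl + (1 - alpha) * cross_entropy

# Toy vocabulary of three tokens; the gold token is index 2.
loss = distillation_loss([0.2, 0.1, 1.5], [0.0, 0.3, 2.0], gold_index=2)
```

In this sketch, `alpha` trades off imitation of the teacher against fitting the reference translation; when the student's logits match the teacher's, the KL term vanishes and only the cross-entropy term remains.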

computational linguistics, machine translation, translation

Country:
- North America
  - United States > Pennsylvania (0.04)
  - Dominican Republic (0.04)
- Europe
  - Portugal > Lisbon
    - Lisbon (0.04)
  - Italy > Tuscany
    - Florence (0.04)
  - Ireland > Leinster
    - County Dublin > Dublin (0.04)
- Asia > India
  - West Bengal > Kharagpur (0.04)
  - Karnataka > Bengaluru (0.04)

Genre:
- Research Report (1.00)

Industry:
- Education (0.47)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Machine Translation (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (0.46)
