SCALE: Synergized Collaboration of Asymmetric Language Translation Engines
Xin Cheng, Xun Wang, Tao Ge, Si-Qing Chen, Furu Wei, Dongyan Zhao, Rui Yan
In this paper, we introduce SCALE, a collaborative framework that connects compact Specialized Translation Models (STMs) and general-purpose Large Language Models (LLMs) as one unified translation engine. By introducing translations from an STM into triplet in-context demonstrations, SCALE unlocks the refinement and pivoting abilities of the LLM, thereby mitigating the language bias of the LLM and the parallel-data bias of the STM, enhancing LLM speciality without sacrificing generality, and facilitating continual learning without expensive LLM fine-tuning. Our comprehensive experiments show that SCALE significantly outperforms both few-shot LLMs (GPT-4) and specialized models (NLLB) in challenging low-resource settings. Moreover, in Xhosa-to-English translation, SCALE achieves a consistent 4-point BLEURT improvement without tuning the LLM, and surpasses few-shot GPT-4 by 2.5 COMET points and 3.8 BLEURT points when equipped with a compact model of merely 600M parameters. SCALE can also effectively exploit the existing language bias of LLMs by using an English-centric STM as a pivot for translation between any language pair, outperforming few-shot GPT-4 by an average of 6 COMET points across eight translation directions. Furthermore, we provide an in-depth analysis of SCALE's robustness, translation characteristics, and latency costs, providing a solid foundation for future studies exploring the potential synergy between LLMs and more specialized, task-specific models.

Large Language Models (LLMs) have recently revolutionized the field of natural language processing (OpenAI, 2023; Touvron et al., 2023; Peng et al., 2023), significantly influencing machine translation (MT) by delivering exceptional performance without requiring a bilingual corpus, particularly in high-resource languages (Brown et al., 2020; Garcia et al., 2023). Moreover, as unified multi-task learners, LLMs represent a substantial step towards artificial general intelligence (Bubeck et al., 2023), with the potential to overcome not only language barriers but also cultural boundaries simultaneously through a simple "translate and explain" prompt.
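To make the triplet-demonstration idea concrete, the sketch below shows one plausible way to assemble a SCALE-style refinement prompt: each in-context example is a (source, STM draft, reference) triplet, and the query carries the source plus the STM's draft for the LLM to refine. This is a minimal illustration under stated assumptions; the helper names `stm_translate` and `llm_complete` and the exact template wording are hypothetical, not the paper's verbatim interface.

```python
def build_scale_prompt(demos, src_lang, tgt_lang, source, stm_draft):
    """Assemble triplet in-context demonstrations (source, STM draft,
    reference) followed by the query pair, asking the LLM to refine
    the draft. Template wording is an illustrative assumption."""
    lines = [
        f"Translate from {src_lang} to {tgt_lang}. A draft translation "
        f"from a specialized model is provided; produce a refined translation."
    ]
    for demo_src, demo_draft, demo_ref in demos:
        lines += [
            f"Source: {demo_src}",
            f"Draft: {demo_draft}",
            f"Translation: {demo_ref}",
            "",
        ]
    lines += [
        f"Source: {source}",
        f"Draft: {stm_draft}",
        "Translation:",
    ]
    return "\n".join(lines)

# Hypothetical usage (stm_translate and llm_complete are assumed wrappers,
# e.g. around a 600M NLLB checkpoint and a few-shot LLM such as GPT-4):
#   draft = stm_translate(source, src="xho", tgt="eng")
#   prompt = build_scale_prompt(demos, "Xhosa", "English", source, draft)
#   refined = llm_complete(prompt)
```

The pivoting mode described in the abstract would fit the same shape: the English-centric STM first maps the source language into English, and that English draft is placed in the prompt so the LLM completes the second hop into the target language.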
arXiv.org Artificial Intelligence
Sep-29-2023