FuxiMT: Sparsifying Large Language Models for Chinese-Centric Multilingual Machine Translation
Shaolin Zhu, Tianyu Dong, Bo Li, Deyi Xiong
–arXiv.org Artificial Intelligence
In this paper, we present FuxiMT, a novel Chinese-centric multilingual machine translation model powered by a sparsified large language model (LLM). We adopt a two-stage strategy to train FuxiMT: we first pre-train the model on a massive Chinese corpus and then conduct multilingual fine-tuning on a large parallel dataset covering 65 languages. FuxiMT incorporates a Mixture-of-Experts (MoE) architecture and employs a curriculum learning strategy for robust performance across resource levels. Experimental results demonstrate that FuxiMT significantly outperforms strong baselines, including state-of-the-art LLMs and machine translation models, particularly in low-resource scenarios. Furthermore, FuxiMT exhibits remarkable zero-shot translation capabilities for unseen language pairs, indicating its potential to bridge communication gaps where parallel data are scarce or unavailable.
May-21-2025
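The abstract does not disclose FuxiMT's routing mechanism, but the sparsification it describes rests on the standard MoE idea: a gating network selects a small subset of experts per token, so only a fraction of the model's parameters are active for any input. Below is a minimal, generic top-k gating sketch in numpy (hypothetical dimensions and weights; not the paper's implementation):

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over the last axis."""
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def moe_layer(x, gate_w, expert_ws, top_k=2):
    """Route each token to its top_k experts and mix their outputs.

    x:         (tokens, d_model) input activations
    gate_w:    (d_model, n_experts) gating weights
    expert_ws: list of (d_model, d_model) per-expert weight matrices
    """
    probs = softmax(x @ gate_w)                     # (tokens, n_experts)
    top = np.argsort(-probs, axis=-1)[:, :top_k]    # top_k expert ids per token
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        chosen = probs[t, top[t]]
        weights = chosen / chosen.sum()             # renormalize over selected experts
        for w, e in zip(weights, top[t]):
            out[t] += w * (x[t] @ expert_ws[e])     # only top_k experts run per token
    return out

# Toy usage with random weights
rng = np.random.default_rng(0)
d_model, n_experts = 4, 3
x = rng.normal(size=(2, d_model))
gate_w = rng.normal(size=(d_model, n_experts))
expert_ws = [rng.normal(size=(d_model, d_model)) for _ in range(n_experts)]
y = moe_layer(x, gate_w, expert_ws)
```

With top_k=2 out of 3 experts, each token touches only two expert matrices per layer; at the scale of an LLM, this is what lets total parameter count grow while per-token compute stays roughly constant.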