Serial or Parallel? Plug-able Adapter for multilingual machine translation
Zhu, Yaoming, Feng, Jiangtao, Zhao, Chengqi, Wang, Mingxuan, Li, Lei
arXiv.org Artificial Intelligence
Developing a unified multilingual translation model is a key topic in machine translation research. However, existing approaches suffer from performance degradation: multilingual models perform worse than models trained separately on rich bilingual data. We attribute this degradation to two issues: multilingual embedding conflation and multilingual fusion effects. To address both, we propose PAM, a Transformer model augmented with defusion adaptation for multilingual machine translation. Specifically, PAM consists of embedding and layer adapters that shift the word and intermediate representations towards language-specific ones. Extensive experimental results on the IWSLT, OPUS-100, and WMT benchmarks show that PAM outperforms several strong competitors, including the series adapter and multilingual knowledge distillation.
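The abstract does not spell out how the adapters are wired, so the following is only a minimal sketch of the generic serial-versus-parallel distinction the title alludes to, not the authors' implementation. It assumes a bottleneck adapter (down-projection, ReLU, up-projection) kept per language; the names `shared_layer`, `adapter_shift`, `serial_layer`, and `parallel_layer`, the toy weights, and the language codes are all hypothetical.

```python
def linear(x, w):
    """Multiply vector x (length d_in) by matrix w (d_in rows, d_out columns)."""
    return [sum(xi * row[j] for xi, row in zip(x, w)) for j in range(len(w[0]))]

def relu(v):
    return [max(0.0, a) for a in v]

def adapter_shift(adapter, v):
    """Assumed bottleneck adapter: down-project, ReLU, up-project."""
    return linear(relu(linear(v, adapter["down"])), adapter["up"])

def shared_layer(x):
    """Stand-in for a shared Transformer sub-layer (here just a fixed scaling)."""
    return [2.0 * a for a in x]

def serial_layer(x, adapter):
    # Serial: the adapter reads the OUTPUT of the shared layer.
    h = shared_layer(x)
    return [a + b for a, b in zip(h, adapter_shift(adapter, h))]

def parallel_layer(x, adapter):
    # Parallel: the adapter reads the same INPUT as the shared layer,
    # and its output is added as a language-specific shift.
    h = shared_layer(x)
    return [a + b for a, b in zip(h, adapter_shift(adapter, x))]

# One adapter per language (toy weights; "de" and "fr" are arbitrary codes).
adapters = {
    "de": {  # passes the first two dimensions through the bottleneck
        "down": [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0], [0.0, 0.0]],
        "up": [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]],
    },
    "fr": {  # zero up-projection: the adapter contributes no shift
        "down": [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0], [0.0, 0.0]],
        "up": [[0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]],
    },
}

x = [1.0, -2.0, 0.5, 3.0]
serial_out = serial_layer(x, adapters["de"])
parallel_out = parallel_layer(x, adapters["de"])
unshifted_out = parallel_layer(x, adapters["fr"])
```

In this toy setup the two wirings give different outputs for the same adapter weights, and an adapter whose up-projection is zero leaves the shared layer's output unchanged, which is why such modules can be plugged in per language without touching the shared parameters.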
Apr-16-2021
- Country:
- North America > United States
- California > San Diego County > San Diego (0.04)
- Europe > Romania
- Genre:
- Research Report (0.64)
- Technology: