Extrapolating Multilingual Understanding Models as Multilingual Generators
Wu, Bohong; Yuan, Fei; Zhao, Hai; Li, Lei; Xu, Jingjing
arXiv.org Artificial Intelligence
Multilingual understanding models (i.e., encoder-based models such as mBERT), pre-trained via masked language modeling, have achieved promising results on many language understanding tasks. However, these non-autoregressive (NAR) models still struggle to generate high-quality text compared with autoregressive (AR) models. Because encoder-based models offer efficient generation and self-correction abilities, this paper explores methods to endow multilingual understanding models with generation abilities, yielding a unified model. Specifically, we start from a multilingual encoder (XLM-R) and propose a Semantic-Guided Alignment-then-Denoising (SGA) approach to adapt an encoder into a multilingual generator with a small number of new parameters. Experiments show that the proposed approach is an effective adaptation method, outperforming widely used initialization-based methods with gains of 9.4 BLEU on machine translation, 8.1 ROUGE-L on question generation, and 5.5 METEOR on story generation on XLM-R$_{large}$. On the other hand, we observe that XLM-R is still inferior to mBART in supervised settings despite better results in zero-shot settings, indicating that more exploration is required to make understanding models strong generators.
May-22-2023
- Country:
  - North America
    - United States
      - Pennsylvania (0.04)
      - Michigan (0.04)
      - Minnesota > Hennepin County
        - Minneapolis (0.14)
      - Louisiana > Orleans Parish
        - New Orleans (0.04)
      - California
        - Santa Barbara County > Santa Barbara (0.04)
        - Los Angeles County > Long Beach (0.04)
    - Canada > British Columbia
  - Europe
    - Spain > Catalonia
      - Barcelona Province > Barcelona (0.04)
    - Italy > Tuscany
      - Florence (0.04)
    - Belgium > Brussels-Capital Region
      - Brussels (0.04)
  - Asia > China
  - Africa > Ethiopia
    - Addis Ababa > Addis Ababa (0.04)
- Genre:
  - Research Report (1.00)
- Technology:
  - Information Technology > Artificial Intelligence > Natural Language
    - Machine Translation (0.52)
    - Large Language Model (0.36)
    - Chatbot (0.34)