Learning to translate by learning to communicate
Downey, C. M., Zhou, Xuhui, Liu, Leo Z., Steinert-Threlkeld, Shane
–arXiv.org Artificial Intelligence
We formulate and test a technique to use Emergent Communication (EC) with a pre-trained multilingual model to improve on modern Unsupervised NMT systems, especially for low-resource languages. It has been argued that the current dominant paradigm in NLP of pre-training on text-only corpora will not yield robust natural language understanding systems, and the need for grounded, goal-oriented, and interactive language learning has been high lighted. In our approach, we embed a multilingual model (mBART, Liu et al., 2020) into an EC image-reference game, in which the model is incentivized to use multilingual generations to accomplish a vision-grounded task. The hypothesis is that this will align multiple languages to a shared task space. We present two variants of EC Fine-Tuning (Steinert-Threlkeld et al., 2022), one of which outperforms a backtranslation-only baseline in all four languages investigated, including the low-resource language Nepali.
arXiv.org Artificial Intelligence
Oct-19-2023
- Country:
- Oceania > Australia
- North America
- Dominican Republic (0.04)
- United States
- Texas > Travis County
- Austin (0.04)
- Pennsylvania > Allegheny County
- Pittsburgh (0.04)
- New York > New York County
- New York City (0.04)
- Minnesota > Hennepin County
- Minneapolis (0.14)
- Louisiana > Orleans Parish
- New Orleans (0.04)
- Illinois > Cook County
- Chicago (0.04)
- Texas > Travis County
- Canada > British Columbia
- Europe
- Germany > Berlin (0.04)
- Netherlands (0.04)
- United Kingdom > England
- Oxfordshire > Oxford (0.04)
- Spain > Catalonia
- Barcelona Province > Barcelona (0.04)
- Italy > Tuscany
- Florence (0.04)
- Belgium > Brussels-Capital Region
- Brussels (0.04)
- Asia
- Genre:
- Research Report > New Finding (0.68)
- Industry:
- Education (0.48)
- Technology: