Enhancing Translation for Indigenous Languages: Experiments with Multilingual Models
Tonja, Atnafu Lambebo, Nigatu, Hellina Hailu, Kolesnikova, Olga, Sidorov, Grigori, Gelbukh, Alexander, Kalita, Jugal
arXiv.org Artificial Intelligence
This paper describes CIC NLP's submission to the AmericasNLP 2023 Shared Task on machine translation systems for indigenous languages of the Americas. We present system descriptions for three methods. We used two multilingual models, M2M-100 and mBART50, and one bilingual (one-to-one) model, the Helsinki NLP Spanish-English translation model, and experimented with different transfer learning setups. We experimented with 11 languages of the Americas and report the setups we used and the results we achieved. Overall, the mBART setup improved upon the baseline for three of the eleven languages.
May-27-2023
- Country:
- South America
- Paraguay (0.14)
- Peru (0.05)
- Brazil > Acre (0.04)
- Bolivia > Potosí Department
- Tomás Frías Province > Potosí (0.04)
- Oceania > Australia
- North America
- Costa Rica (0.04)
- United States
- Wisconsin > Milwaukee County
- Milwaukee (0.04)
- Texas > Harris County
- Houston (0.04)
- New York > New York County
- New York City (0.04)
- Colorado > El Paso County
- Colorado Springs (0.04)
- California > Alameda County
- Berkeley (0.04)
- Mexico
- Jalisco (0.05)
- Zacatecas (0.04)
- San Luis Potosí (0.04)
- Oaxaca (0.04)
- Europe
- Germany > Berlin (0.04)
- Norway (0.04)
- Russia (0.04)
- Czechia > Prague (0.04)
- Bulgaria > Varna Province
- Varna (0.04)
- Slovenia > Central Slovenia
- Municipality of Ljubljana > Ljubljana (0.04)
- Italy > Tuscany
- Florence (0.04)
- Denmark > Capital Region
- Copenhagen (0.04)
- Portugal > Lisbon
- Lisbon (0.04)
- Finland > Uusimaa
- Helsinki (0.27)
- Middle East > Republic of Türkiye
- Istanbul Province > Istanbul (0.04)
- Ireland > Leinster
- County Dublin > Dublin (0.04)
- Asia
- Russia (0.04)
- Middle East > Republic of Türkiye
- Istanbul Province > Istanbul (0.04)
- China > Yunnan Province
- Kunming (0.04)
- Genre:
- Research Report (0.64)
- Technology: