Defining Boundaries: The Impact of Domain Specification on Cross-Language and Cross-Domain Transfer in Machine Translation
Lia Shahnazaryan, Meriem Beloucif
arXiv.org Artificial Intelligence
Recent advancements in neural machine translation (NMT) have revolutionized the field, yet the dependency on extensive parallel corpora limits progress for low-resource languages. Cross-lingual transfer learning offers a promising solution by utilizing data from high-resource languages, but it often struggles with in-domain NMT. In this paper, we investigate three pivotal aspects: enhancing the domain-specific quality of NMT by fine-tuning on domain-relevant data from different language pairs, identifying which domains are transferable in zero-shot scenarios, and assessing the impact of language-specific versus domain-specific factors on adaptation effectiveness. Using English as the source language and Spanish for fine-tuning, we evaluate multiple target languages: Portuguese, Italian, French, Czech, Polish, and Greek. Our findings reveal significant improvements in domain-specific translation quality, especially in specialized fields such as medical, legal, and IT, underscoring the importance of well-defined domain data and a transparent experimental setup in in-domain transfer learning.
Aug-21-2024
- Country:
  - Asia
    - Japan > Kyūshū & Okinawa
      - Kyūshū > Miyazaki Prefecture > Miyazaki (0.04)
    - Middle East
      - Republic of Türkiye > Istanbul Province
        - Istanbul (0.04)
      - UAE > Abu Dhabi Emirate
        - Abu Dhabi (0.04)
  - Europe
    - Belgium > Brussels-Capital Region
      - Brussels (0.04)
    - Italy > Tuscany
      - Florence (0.04)
    - Middle East > Republic of Türkiye
      - Istanbul Province > Istanbul (0.04)
    - Poland > Masovia Province
      - Warsaw (0.04)
    - Spain (0.04)
    - Sweden > Uppsala County
      - Uppsala (0.04)
  - North America
    - Canada > Ontario
      - Toronto (0.04)
    - Dominican Republic (0.04)
    - United States
      - Minnesota > Hennepin County
        - Minneapolis (0.14)
      - Pennsylvania (0.04)
      - Texas > Travis County
        - Austin (0.04)
- Genre:
  - Research Report
    - Experimental Study (0.94)
    - New Finding (1.00)
- Technology: