Domain Adaptation and Multi-Domain Adaptation for Neural Machine Translation: A Survey
Saunders, Danielle (SDL plc)
Journal of Artificial Intelligence Research
The development of deep learning techniques has allowed Neural Machine Translation (NMT) models to become extremely powerful, given sufficient training data and training time. However, systems struggle when translating text from a new domain with a distinct style or vocabulary. Fine-tuning on in-domain data allows good domain adaptation, but requires sufficient relevant bilingual data. Even if this is available, simple fine-tuning can cause overfitting to new data and catastrophic forgetting of previously learned behaviour. We survey approaches to domain adaptation for NMT, particularly where a system may need to translate across multiple domains. We divide techniques into those revolving around data selection or generation, model architecture, parameter adaptation procedure, and inference procedure. Finally, we highlight the benefits of domain adaptation and multi-domain adaptation techniques to other lines of NMT research.
Sep-29-2022
- Country:
- Oceania > Australia
- Victoria > Melbourne (0.04)
- New South Wales > Sydney (0.04)
- North America
- Dominican Republic (0.04)
- United States
- Texas > Travis County
- Austin (0.04)
- Minnesota > Hennepin County
- Minneapolis (0.14)
- California > San Diego County
- San Diego (0.04)
- Colorado > Denver County
- Denver (0.04)
- Hawaii > Honolulu County
- Honolulu (0.04)
- New York
- New York County > New York City (0.04)
- Monroe County > Rochester (0.04)
- Louisiana > Orleans Parish
- New Orleans (0.04)
- Pennsylvania > Philadelphia County
- Philadelphia (0.04)
- Oregon > Multnomah County
- Portland (0.04)
- New Mexico > Santa Fe County
- Santa Fe (0.04)
- Washington > King County
- Seattle (0.04)
- Massachusetts
- Suffolk County > Boston (0.14)
- Middlesex County > Cambridge (0.04)
- Canada > British Columbia
- Europe
- Austria (0.04)
- Czechia > Prague (0.04)
- Netherlands > North Holland
- Amsterdam (0.04)
- Bulgaria
- Varna Province > Varna (0.04)
- Sofia City Province > Sofia (0.04)
- Slovakia > Bratislava
- Bratislava (0.04)
- Italy > Tuscany
- Florence (0.05)
- Germany
- Berlin (0.04)
- Bavaria > Upper Bavaria
- Munich (0.04)
- France > Provence-Alpes-Côte d'Azur
- Bouches-du-Rhône > Marseille (0.04)
- Spain
- Community of Madrid > Madrid (0.04)
- Valencian Community > Valencia Province
- Valencia (0.04)
- Catalonia > Barcelona Province
- Barcelona (0.04)
- Denmark > Capital Region
- Copenhagen (0.04)
- Greece > Attica
- Athens (0.04)
- Portugal > Lisbon
- Lisbon (0.14)
- Ukraine > Kyiv Oblast
- Kyiv (0.04)
- Sweden
- Uppsala County > Uppsala (0.04)
- Stockholm > Stockholm (0.04)
- Ireland > Leinster
- County Dublin > Dublin (0.04)
- United Kingdom
- Scotland > City of Edinburgh
- Edinburgh (0.04)
- England > Cambridgeshire
- Cambridge (0.27)
- Belgium > Brussels-Capital Region
- Brussels (0.04)
- Asia
- Taiwan > Taiwan Province
- Taipei (0.04)
- South Korea > Incheon
- Incheon (0.04)
- Middle East
- Japan
- Kyūshū & Okinawa > Kyūshū
- Miyazaki Prefecture > Miyazaki (0.04)
- Honshū
- Kantō > Tokyo Metropolis Prefecture
- Tokyo (0.14)
- Kansai > Osaka Prefecture
- Osaka (0.04)
- India > Bihar
- Patna (0.04)
- China
- Taiwan > Taiwan Province
- Genre:
- Overview (1.00)
- Research Report (0.67)
- Instructional Material > Course Syllabus & Notes (0.46)
- Industry:
- Information Technology > Security & Privacy (0.92)
- Health & Medicine (0.67)
- Technology: