Language Modelling Approaches to Adaptive Machine Translation
–arXiv.org Artificial Intelligence
Consistency is a key requirement of high-quality translation. It is especially important to adhere to pre-approved terminology and to adapt to corrected translations in domain-specific projects. Machine translation (MT) has made significant progress in domain adaptation. However, in-domain data scarcity is common in translation settings, owing to the lack of specialised datasets and terminology, or to the inconsistency and inaccuracy of available in-domain translations. In such scenarios, where there is insufficient in-domain data to fine-tune MT models, producing translations that are consistent with the relevant context is challenging. While real-time adaptation can use smaller amounts of in-domain data to improve translation on the fly, it remains challenging because of limits on supported context and efficiency constraints. Large language models (LLMs) have recently shown in-context learning capabilities: they learn to replicate certain input-output text generation patterns without further fine-tuning. Such capabilities have opened new horizons for domain-specific data augmentation and real-time adaptive MT. This work attempts to address two main questions: 1) in scenarios involving human interaction and continuous feedback, can we employ language models to improve the quality of adaptive MT at inference time? and 2) in the absence of sufficient in-domain data, can we use pre-trained large-scale language models to improve the process of MT domain adaptation?
Jan-25-2024
- Country:
- Oceania > Australia
- North America
- Dominican Republic (0.04)
- United States
- Texas > Travis County
- Austin (0.04)
- Pennsylvania > Philadelphia County
- Philadelphia (0.04)
- New York > New York County
- New York City (0.04)
- Minnesota > Hennepin County
- Minneapolis (0.14)
- Massachusetts
- Suffolk County > Boston (0.04)
- Middlesex County > Cambridge (0.04)
- Louisiana > Orleans Parish
- New Orleans (0.04)
- Hawaii > Honolulu County
- Honolulu (0.04)
- California > San Diego County
- San Diego (0.04)
- Canada
- Ontario > Toronto (0.04)
- Quebec > Montreal (0.04)
- British Columbia > Metro Vancouver Regional District
- Vancouver (0.04)
- Europe
- Germany > Berlin (0.04)
- Switzerland > Zürich
- Zürich (0.04)
- Bulgaria > Varna Province
- Varna (0.04)
- France > Provence-Alpes-Côte d'Azur
- Bouches-du-Rhône > Marseille (0.04)
- Denmark > Capital Region
- Copenhagen (0.04)
- Sweden > Uppsala County
- Uppsala (0.04)
- Finland
- Portugal > Lisbon
- Lisbon (0.13)
- Italy
- Tuscany > Florence (0.04)
- Calabria > Catanzaro Province
- Catanzaro (0.04)
- Spain > Catalonia
- Barcelona Province > Barcelona (0.04)
- Ukraine > Kyiv Oblast
- Kyiv (0.04)
- Middle East > Republic of Türkiye
- Istanbul Province > Istanbul (0.04)
- Ireland > Leinster
- County Dublin > Dublin (0.04)
- Belgium > Brussels-Capital Region
- Brussels (0.04)
- United Kingdom > Scotland
- City of Edinburgh > Edinburgh (0.04)
- Asia
- Singapore (0.04)
- China > Hong Kong (0.04)
- Macao (0.04)
- Vietnam > Da Nang
- Da Nang (0.04)
- Myanmar > Tanintharyi Region
- Dawei (0.04)
- Middle East
- Jordan (0.04)
- UAE > Abu Dhabi Emirate
- Abu Dhabi (0.14)
- Republic of Türkiye > Istanbul Province
- Istanbul (0.04)
- Qatar > Ad-Dawhah
- Doha (0.04)
- India > Bihar
- Patna (0.04)
- Japan
- Kyūshū & Okinawa > Kyūshū
- Miyazaki Prefecture > Miyazaki (0.04)
- Honshū > Chūbu
- Toyama Prefecture > Toyama (0.04)
- Aichi Prefecture > Nagoya (0.04)
- South Korea > Incheon
- Incheon (0.04)
- Africa > Rwanda
- Genre:
- Workflow (1.00)
- Research Report > New Finding (1.00)
- Questionnaire & Opinion Survey (0.92)
- Industry:
- Law (0.92)
- Information Technology (0.92)
- Health & Medicine > Therapeutic Area
- Infections and Infectious Diseases (1.00)
- Immunology (0.92)
- Technology: