Analyzing Context Utilization of LLMs in Document-Level Translation
Mohammed, Wafaa, Niculae, Vlad
arXiv.org Artificial Intelligence
Large language models (LLMs) are increasingly strong contenders in machine translation. We study document-level translation, where some words cannot be translated without context from outside the sentence. We investigate the ability of prominent LLMs to utilize context by analyzing models' robustness to perturbed and randomized document context. We find that LLMs' improved document-translation performance is not always reflected in pronoun translation performance. We highlight the need for context-aware finetuning of LLMs with a focus on relevant parts of the context to improve their reliability for document-level translation.
Oct-18-2024
- Country:
- North America
- United States
- Minnesota > Hennepin County
- Minneapolis (0.04)
- California > Los Angeles County
- Long Beach (0.04)
- Canada > Ontario
- Toronto (0.04)
- Europe
- Portugal > Lisbon
- Lisbon (0.14)
- Netherlands > North Holland
- Amsterdam (0.04)
- Middle East > Malta
- Eastern Region > Northern Harbour District > St. Julian's (0.04)
- Italy > Trentino-Alto Adige/Südtirol
- Trentino Province > Trento (0.04)
- Ireland > Leinster
- County Dublin > Dublin (0.04)
- Bulgaria > Sofia City Province
- Sofia (0.04)
- Belgium > Brussels-Capital Region
- Brussels (0.04)
- Asia
- Singapore (0.04)
- China > Hong Kong (0.04)
- Middle East > UAE
- Abu Dhabi Emirate > Abu Dhabi (0.04)
- Genre:
- Research Report (0.64)
- Technology: