Mitigating Hallucinated Translations in Large Language Models with Hallucination-focused Preference Optimization
Tang, Zilu, Chatterjee, Rajen, Garg, Sarthak
–arXiv.org Artificial Intelligence
Machine Translation (MT) is undergoing a paradigm shift, with systems based on fine-tuned large language models (LLM) becoming increasingly competitive with traditional encoder-decoder models trained specifically for translation tasks. However, LLM-based systems are at a higher risk of generating hallucinations, which can severely undermine user's trust and safety. Most prior research on hallucination mitigation focuses on traditional MT models, with solutions that involve post-hoc mitigation - detecting hallucinated translations and re-translating them. While effective, this approach introduces additional complexity in deploying extra tools in production and also increases latency. To address these limitations, we propose a method that intrinsically learns to mitigate hallucinations during the model training phase. Specifically, we introduce a data creation framework to generate hallucination focused preference datasets. Fine-tuning LLMs on these preference datasets reduces the hallucination rate by an average of 96% across five language pairs, while preserving overall translation quality. In a zero-shot setting our approach reduces hallucinations by 89% on an average across three unseen target languages.
arXiv.org Artificial Intelligence
Jan-28-2025
- Country:
- Africa > Ethiopia
- Addis Ababa > Addis Ababa (0.04)
- Asia
- India
- Chhattisgarh (0.04)
- Madhya Pradesh (0.04)
- Mizoram (0.04)
- Rajasthan (0.04)
- Telangana (0.04)
- Middle East
- Jordan (0.04)
- UAE > Abu Dhabi Emirate
- Abu Dhabi (0.04)
- Singapore (0.04)
- India
- Europe
- Belgium > Flanders
- East Flanders > Ghent (0.04)
- Croatia > Dubrovnik-Neretva County
- Dubrovnik (0.04)
- Ireland > Leinster
- County Dublin > Dublin (0.04)
- Italy > Tuscany
- Florence (0.04)
- Middle East > Malta (0.04)
- Portugal > Lisbon
- Lisbon (0.04)
- Spain
- Catalonia > Barcelona Province
- Barcelona (0.04)
- Valencian Community > Valencia Province
- Valencia (0.04)
- Catalonia > Barcelona Province
- Belgium > Flanders
- North America
- Canada > Ontario
- Toronto (0.04)
- United States
- Louisiana > Orleans Parish
- New Orleans (0.04)
- Massachusetts > Suffolk County
- Boston (0.04)
- New York > New York County
- New York City (0.04)
- Louisiana > Orleans Parish
- Canada > Ontario
- Africa > Ethiopia
- Genre:
- Research Report > New Finding (0.67)
- Technology: