Towards Cross-Cultural Machine Translation with Retrieval-Augmented Generation from Multilingual Knowledge Graphs
Simone Conia, Daniel Lee, Min Li, Umar Farooq Minhas, Saloni Potdar, Yunyao Li
arXiv.org Artificial Intelligence
Translating text that contains entity names is a challenging task, as culture-related references can vary significantly across languages. These variations may also be caused by transcreation, an adaptation process that entails more than transliteration and word-for-word translation. In this paper, we address the problem of cross-cultural translation on two fronts: (i) we introduce XC-Translate, the first large-scale, manually created benchmark for machine translation that focuses on text containing potentially culturally nuanced entity names, and (ii) we propose KG-MT, a novel end-to-end method that integrates information from a multilingual knowledge graph into a neural machine translation model by leveraging a dense retrieval mechanism. Our experiments and analyses show that current machine translation systems and large language models still struggle to translate texts containing entity names, whereas KG-MT outperforms state-of-the-art approaches by a large margin, obtaining 129% and 62% relative improvements over NLLB-200 and GPT-4, respectively.
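The retrieve-then-translate idea in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy knowledge graph `TOY_KG`, the helpers `embed`, `cosine`, `retrieve`, and `augment`, and the choice to prepend retrieved entity names to the source text. In particular, `embed` uses character-trigram counts as a stand-in for a learned dense encoder, and how KG-MT actually feeds retrieved information into the translation model is not specified in the abstract.

```python
import math
from collections import Counter

# Hypothetical toy knowledge graph: entity ID -> entity name per language.
# Illustrative entries only; not drawn from the paper's data.
TOY_KG = {
    "E1": {"en": "The Catcher in the Rye", "it": "Il giovane Holden"},
    "E2": {"en": "Little Women", "it": "Piccole donne"},
    "E3": {"en": "Home Alone", "it": "Mamma, ho perso l'aereo"},
}


def embed(text):
    """Toy stand-in for a dense encoder: character-trigram counts."""
    padded = f"  {text.lower()}  "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a if key in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(source, kg, src_lang, k=1):
    """Return the IDs of the k entities whose source-language name
    is most similar to the source text."""
    query = embed(source)
    ranked = sorted(
        kg.items(),
        key=lambda item: cosine(query, embed(item[1][src_lang])),
        reverse=True,
    )
    return [entity_id for entity_id, _ in ranked[:k]]


def augment(source, kg, src_lang, tgt_lang, k=1):
    """Prepend retrieved name pairs to the source text (assumed format)."""
    hints = [
        f"{kg[eid][src_lang]} = {kg[eid][tgt_lang]}"
        for eid in retrieve(source, kg, src_lang, k)
    ]
    return " | ".join(hints) + " || " + source


if __name__ == "__main__":
    print(augment("Who wrote The Catcher in the Rye?", TOY_KG, "en", "it"))
```

In this sketch the augmented string would then be passed to a translation model, so the model sees the established target-language name of the entity rather than having to translate the title word for word.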
Oct-17-2024
- Country:
  - Africa > Niger (0.04)
  - Asia > Singapore (0.04)
  - Europe
    - Ireland > Leinster
      - County Dublin > Dublin (0.04)
    - Italy
      - Trentino-Alto Adige/Südtirol > Trentino Province
        - Trento (0.04)
      - Tuscany > Florence (0.04)
    - Slovenia (0.04)
  - North America
    - Canada
    - United States
      - California
        - Los Angeles County > Long Beach (0.04)
        - San Francisco County > San Francisco (0.14)
      - Florida > Miami-Dade County
        - Miami (0.04)
      - Louisiana > Orleans Parish
        - New Orleans (0.04)
      - Minnesota > Hennepin County
        - Minneapolis (0.14)
      - New Mexico > Santa Fe County
        - Santa Fe (0.04)
      - Pennsylvania > Philadelphia County
        - Philadelphia (0.04)
      - Washington > King County
        - Seattle (0.14)
- Genre:
  - Overview > Innovation (0.34)
  - Research Report > Promising Solution (0.34)
- Technology: