A 2-step Framework for Automated Literary Translation Evaluation: Its Promises and Pitfalls
Sheikh Shafayat, Dongkeun Yoon, Woori Jang, Jiwoo Choi, Alice Oh, Seohyon Jung
arXiv.org Artificial Intelligence
In this work, we propose and evaluate the feasibility of a two-stage pipeline for fine-grained evaluation of literary machine translation from English to Korean. The results show that our framework provides fine-grained, interpretable metrics suited to literary translation and achieves higher correlation with human judgment than traditional machine translation metrics. Nonetheless, it still fails to match inter-human agreement, especially on metrics such as Korean Honorifics. We also observe that LLMs tend to favor translations generated by other LLMs, and we highlight the necessity of developing more sophisticated evaluation methods to ensure accurate and culturally sensitive machine translation of literary works.
Figure 1: Overview of our proposed framework: we evaluate translations of literary works in two stages.
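The abstract describes a two-stage, fine-grained evaluation pipeline whose scores are compared against human judgment. Below is a minimal Python sketch of how such a pipeline could be wired up; the category names, the llm_score() stub, and the rank-correlation check are hypothetical illustrations under assumed interfaces, not the authors' actual rubric, prompts, or code.

```python
from statistics import mean

# Hypothetical fine-grained dimensions; the paper mentions categories such as
# Korean Honorifics among its metrics, the rest are illustrative placeholders.
CATEGORIES = ["fluency", "faithfulness", "korean_honorifics", "literary_style"]

def llm_score(source: str, translation: str, category: str) -> float:
    """Stage 1 (stub): a real implementation would prompt an LLM judge to rate
    the translation on this single category (e.g. 1-5). A constant is returned
    here so the sketch runs without any API access."""
    return 3.0

def evaluate(source: str, translation: str) -> dict:
    """Stage 2: collect the per-category scores and aggregate them into one
    interpretable score profile for the translation."""
    scores = {c: llm_score(source, translation, c) for c in CATEGORIES}
    scores["overall"] = mean(scores[c] for c in CATEGORIES)
    return scores

def spearman(auto_scores, human_scores):
    """Spearman rank correlation (no tie correction), used to check how well
    the automatic scores agree with human judgments."""
    def ranks(values):
        order = sorted(range(len(values)), key=lambda i: values[i])
        r = [0] * len(values)
        for rank, idx in enumerate(order):
            r[idx] = rank
        return r
    rx, ry = ranks(auto_scores), ranks(human_scores)
    n = len(auto_scores)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

if __name__ == "__main__":
    print(evaluate("He opened the old gate.", "그는 낡은 문을 열었다."))
    print(spearman([3.0, 4.0, 2.0, 5.0], [3.5, 4.5, 2.0, 4.0]))
```

In this sketch, the per-category scores are what make the output interpretable, while the overall aggregate and the Spearman check stand in for the correlation-with-human-judgment comparison reported in the abstract.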
Jan-1-2025
- Genre:
- Research Report > New Finding (0.65)