Evaluating Optimal Reference Translations
Zouhar, Vilém; Kloudová, Věra; Popel, Martin; Bojar, Ondřej
arXiv.org Artificial Intelligence
Machine translation (MT) is routinely evaluated using segment-level similarity metrics against one or more reference translations. At the same time, reference translations acquired in the standard way are often criticized for flaws of various types. For several high-resource language pairs, MT quality reaches levels comparable to that of the reference translation (Freitag et al. 2022; Hassan et al. 2018), and MT sometimes even significantly surpasses humans in a particular evaluation setting (Popel et al. 2020). Given this, one could conclude that state-of-the-art MT has reached the point where reference-based evaluation is no longer reliable and that we have to resort to other methods (such as targeted expert evaluation of particular outputs), even if they are costly, subjective, and possibly impossible to automate. The narrow goal of the presented work is to allow for an "extension of the expiry date" of reference-based evaluation methods. More broadly, we want to formulate a methodology for creating reference translations that avoid the often-observed deficiencies of "standard" or "professional" reference translations, be it multiple interfering phenomena, inappropriate expressions, disregard for topic-focus articulation (information structure), or other abundant shortcomings that indicate their authors' insensitivity to the topic itself and, above all, to the source and target languages. To this end, we introduce so-called optimal reference translations (ORT), which are intended to represent optimal (ideal or excellent) human translations, should they be the subject of a translation quality evaluation.
Nov-28-2023
- Country:
  - Asia
  - Europe > United Kingdom
    - England (0.28)
  - North America > United States (0.68)
- Genre:
  - Research Report (1.00)
- Industry:
  - Government
    - Military (1.00)
    - Regional Government
      - Asia Government (0.46)
      - North America Government > United States Government (0.46)
  - Health & Medicine (0.93)
  - Law (0.67)
- Technology: