MQM Re-Annotation: A Technique for Collaborative Evaluation of Machine Translation
Parker Riley, Daniel Deutsch, Mara Finkelstein, Colten DiIanni, Juraj Juraska, Markus Freitag
arXiv.org Artificial Intelligence
Human evaluation of machine translation is in an arms race with translation model quality: as our models get better, our evaluation methods need to improve to ensure that quality gains are not lost in evaluation noise. To this end, we experiment with a two-stage version of the current state-of-the-art translation evaluation paradigm (MQM), which we call MQM re-annotation. In this setup, an MQM annotator reviews and edits a set of pre-existing MQM annotations, which may have come from that same annotator, another human annotator, or an automatic MQM annotation system. We demonstrate that rater behavior in re-annotation aligns with our goals, and that re-annotation results in higher-quality annotations, mostly due to finding errors that were missed during the first pass.
Oct-29-2025