Evaluating Pronominal Anaphora in Machine Translation: An Evaluation Measure and a Test Suite

Jwalapuram, Prathyusha, Joty, Shafiq, Temnikova, Irina, Nakov, Preslav

Aug-31-2019–arXiv.org Artificial Intelligence

The ongoing neural revolution in machine translation has made it easier to model larger contexts beyond the sentence-level, which can potentially help resolve some discourse-level ambiguities such as pronominal anaphora, thus enabling better translations. Unfortunately, even when the resulting improvements are seen as substantial by humans, they remain virtually unnoticed by traditional automatic evaluation measures like BLEU, as only a few words end up being affected. Thus, specialized evaluation measures are needed. With this aim in mind, we contribute an extensive, targeted dataset that can be used as a test suite for pronoun translation, covering multiple source languages and different pronoun errors drawn from real system translations, for English. We further propose an evaluation measure to differentiate good and bad pronoun translations. We also conduct a user study to report correlations with human judgments.

artificial intelligence, natural language, translation, (13 more...)

arXiv.org Artificial Intelligence

Aug-31-2019

arXiv.org PDF

Add feedback

Country:
- North America > United States (1.00)
- Europe (1.00)
- Asia > Middle East
  - Qatar (0.14)

Genre:
- Research Report (1.00)

Technology:
- Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found