Translationese Reduction using Abstract Meaning Representation

Apr-22-2023, arXiv.org Artificial Intelligence

Translated texts and utterances bear several hallmarks that distinguish them from texts originally written in the language. This phenomenon, known as translationese, is well documented, and when it occurs in training or test sets it can affect model performance. Still, mitigating the effect of translationese in human-translated text remains understudied. We hypothesize that Abstract Meaning Representation (AMR), a semantic representation that abstracts away from surface form, can be used as an interlingua to reduce the amount of translationese in translated texts. By parsing English translations into AMR graphs and then generating text from those graphs, we obtain texts that more closely resemble non-translationese by macro-level measures. We show, across four metrics and qualitatively, that using AMR as an interlingua reduces translationese, and we compare our results to two additional approaches: one based on round-trip machine translation and one based on syntactically controlled generation.
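The parse-then-generate idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: `amr_round_trip`, `type_token_ratio`, `toy_parse`, and `toy_generate` are hypothetical names introduced here. The toy functions are placeholders that produce no real AMR, and type-token ratio is an assumed example of a macro-level measure, since the abstract does not name its four metrics. In practice, `parse` and `generate` would wrap a trained AMR parser and an AMR-to-text generator.

```python
from typing import Callable, List

# Hypothetical aliases: a graph is assumed to be serialized as a string.
Parser = Callable[[List[str]], List[str]]      # sentences -> graphs
Generator = Callable[[List[str]], List[str]]   # graphs -> sentences


def amr_round_trip(sentences: List[str], parse: Parser, generate: Generator) -> List[str]:
    """Parse each sentence into a graph, then regenerate text from the graph."""
    graphs = parse(sentences)
    if len(graphs) != len(sentences):
        raise ValueError("parser must return one graph per sentence")
    return generate(graphs)


def type_token_ratio(sentences: List[str]) -> float:
    """Distinct lowercased tokens divided by total tokens (whitespace split).

    Assumed here as one possible macro-level measure of lexical variety.
    """
    tokens = [tok.lower() for s in sentences for tok in s.split()]
    return len(set(tokens)) / len(tokens) if tokens else 0.0


# Placeholder stand-ins so the sketch runs without any model.
# They do NOT produce real AMR; they only tag and untag the input string.
def toy_parse(sentences: List[str]) -> List[str]:
    return ["GRAPH::" + s for s in sentences]


def toy_generate(graphs: List[str]) -> List[str]:
    return [g[len("GRAPH::"):] for g in graphs]


if __name__ == "__main__":
    translated = ["The boy wants to go.", "The boy wants to stay."]
    regenerated = amr_round_trip(translated, toy_parse, toy_generate)
    # Compare a corpus-level statistic before and after the round trip.
    print(type_token_ratio(translated), type_token_ratio(regenerated))
```

Passing the parser and generator in as callables keeps the sketch independent of any particular AMR toolkit. With real models, the regenerated text would generally differ from the input, and the comparison of corpus-level statistics would become meaningful.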

artificial intelligence, natural language, translationese

Country:
- South America > Chile
  - Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- North America
  - Dominican Republic (0.04)
  - United States
    - Pennsylvania (0.04)
    - Washington > King County
      - Seattle (0.14)
    - Oregon > Multnomah County
      - Portland (0.04)
    - Louisiana > Orleans Parish
      - New Orleans (0.04)
  - Canada > British Columbia
    - Metro Vancouver Regional District > Vancouver (0.04)
- Europe
  - Denmark (0.04)
  - Norway (0.04)
  - Slovenia (0.04)
  - Sweden (0.04)
  - United Kingdom > Scotland (0.04)
  - Bulgaria
    - Sofia City Province > Sofia (0.04)
    - Varna Province > Varna (0.04)
  - Iceland > Capital Region
    - Reykjavik (0.04)
  - Italy > Tuscany
    - Florence (0.04)
  - Germany
    - Berlin (0.04)
    - Baden-Württemberg > Tübingen Region
      - Tübingen (0.04)
  - France > Provence-Alpes-Côte d'Azur
    - Bouches-du-Rhône > Marseille (0.04)
  - Spain
    - Valencian Community > Valencia Province
      - Valencia (0.04)
    - Galicia > A Coruña Province
      - Santiago de Compostela (0.04)
  - Portugal > Lisbon
    - Lisbon (0.04)
  - Ireland > Leinster
    - County Dublin > Dublin (0.04)
- Asia
  - Thailand > Phuket
    - Phuket (0.04)
  - Middle East > UAE
    - Abu Dhabi Emirate > Abu Dhabi (0.04)
  - India > Karnataka
    - Bengaluru (0.04)

Genre:
- Research Report > New Finding (0.66)

Technology:
- Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
