WeTS: A Benchmark for Translation Suggestion

Yang, Zhen, Meng, Fandong, Zhang, Yingxue, Li, Ernan, Zhou, Jie

Oct-10-2022–arXiv.org Artificial Intelligence

Translation Suggestion (TS), which provides alternatives for specific words or phrases given an entire document translated by machine translation (MT) (Lee et al., 2021), has been proven to play a significant role in post-editing (PE). However, there is still no publicly available dataset to support in-depth research on this problem, and no reproducible experimental results for researchers in this community to follow. To address this limitation, we create a benchmark dataset for TS, called WeTS, which contains a golden corpus annotated by expert translators in four translation directions. Apart from the human-annotated golden corpus, we also propose several novel methods to generate synthetic corpora, which can substantially improve the performance of TS. With the corpora we construct, we introduce a Transformer-based model for TS, and experimental results show that our model achieves state-of-the-art (SOTA) results in all four translation directions: English-to-German, German-to-English, Chinese-to-English, and English-to-Chinese. Code and corpora can be found at https://github.com/ZhenYangIACAS/WeTS.git.


