Cogs in a Machine, Doing What They're Meant to Do -- The AMI Submission to the WMT24 General Translation Task
Jasonarson, Atli; Hafsteinsson, Hinrik; Ármannsson, Bjarki; Steingrímsson, Steinþór
arXiv.org Artificial Intelligence
This paper presents the submission of the Árni Magnússon Institute's team to the WMT24 General Translation Task. We work on the English→Icelandic translation direction. Our system comprises four translation models and a grammar correction model. For training our models we carefully curate our datasets, aggressively filtering out sentence pairs that may detrimentally affect the quality of our system's output. Some of our data are collected from human translations and some are synthetically generated. A part of the synthetic data is generated using an LLM, and we find that it increases the translation capability of our system significantly.
Oct-4-2024
- Country:
  - Oceania > Australia
  - North America
    - United States
      - New York (0.04)
      - California > Los Angeles County
        - Long Beach (0.04)
    - Canada > Ontario
      - Toronto (0.04)
  - Europe
    - Spain (0.04)
    - Iceland > Capital Region
      - Reykjavik (0.04)
    - Finland
      - Southwest Finland > Turku (0.04)
      - Pirkanmaa > Tampere (0.04)
    - Portugal > Lisbon
      - Lisbon (0.14)
    - Czechia > South Moravian Region
      - Brno (0.04)
    - Belgium > Flanders
      - Flemish Brabant > Leuven (0.04)
      - East Flanders > Ghent (0.04)
    - Sweden > Östergötland County
      - Linköping (0.04)
    - Faroe Islands > Streymoy
      - Tórshavn (0.04)
    - Ireland > Leinster
      - County Dublin > Dublin (0.04)
    - Estonia > Tartu County
      - Tartu (0.04)
  - Asia
    - Singapore (0.05)
    - Japan (0.04)
    - Taiwan > Taiwan Province
      - Taipei (0.04)
    - Middle East > UAE
      - Abu Dhabi Emirate > Abu Dhabi (0.04)
- Genre:
  - Research Report (1.00)
- Technology: