An Evaluation of Persian-English Machine Translation Datasets with Transformers
Sartipi, Amir, Dehghan, Meghdad, Fatemi, Afsaneh
arXiv.org Artificial Intelligence
Machine translation (MT) is an active research area. However, Persian machine translation remains underexplored, even though a vast amount of research has been conducted on high-resource languages such as English. Moreover, while a substantial amount of work has applied statistical machine translation to some Persian datasets, there is currently no standard baseline for transformer-based text2text models on each corpus. This study collected and analysed the most popular and valuable parallel corpora used for Persian-English translation. Furthermore, we fine-tuned and evaluated two state-of-the-art attention-based seq2seq models on each dataset separately (48 results). We hope this paper will help researchers compare their Persian-to-English and English-to-Persian machine translation results to a standard baseline.
Figure 1: Transformer model architecture.
Feb-1-2023
- Country:
- Oceania
- New Zealand (0.04)
- Australia > Victoria
- Melbourne (0.04)
- North America > United States
- Pennsylvania (0.04)
- Maryland > Baltimore (0.04)
- Minnesota > Hennepin County
- Minneapolis (0.14)
- Louisiana > Orleans Parish
- New Orleans (0.04)
- California > San Diego County
- San Diego (0.04)
- Europe
- Slovenia (0.04)
- United Kingdom > England
- Cambridgeshire > Cambridge (0.04)
- Middle East > Republic of Türkiye
- Istanbul Province > Istanbul (0.04)
- Italy > Tuscany
- Florence (0.04)
- France > Provence-Alpes-Côte d'Azur
- Bouches-du-Rhône > Marseille (0.04)
- Belgium > Brussels-Capital Region
- Brussels (0.04)
- Asia
- Thailand > Chiang Mai
- Chiang Mai (0.04)
- Middle East
- Republic of Türkiye
- Istanbul Province > Istanbul (0.04)
- Antalya Province > Antalya (0.04)
- Qatar > Ad-Dawhah
- Doha (0.04)
- Iran > Tehran Province
- Tehran (0.04)
- Japan
- Kyūshū & Okinawa > Kyūshū
- Miyazaki Prefecture > Miyazaki (0.04)
- Honshū > Chūbu
- Aichi Prefecture > Nagoya (0.04)
- Genre:
- Research Report (0.82)
- Technology: