AITopics | Machine Translation

Machine Translation

"Machine translation (MT) is the application of computers to the task of translating texts from one natural language to another. One of the very earliest pursuits in computer science, MT has proved to be an elusive goal, but today a number of systems are available which produce output which, if not perfect, is of sufficient quality to be useful in a number of specific domains."
– Definition from the European Association for Machine Translation (EAMT).

You can translate text of your choice using free translators such as CAPITA, Google Translate, SDL International, and SYSTRAN.

The Quest for Human Parity Machine Translation

#artificialintelligence, Apr-2-2021, 19:07:38 GMT

Recently, some in the Singularity community have admitted that "language is hard," as seen in this attempt to explain why AI has not yet mastered translation. Michael Housman, a faculty member of Singularity University, explained that the ideal scenario for machine learning and artificial intelligence is something with fixed rules and a clear-cut measure of success or failure. He named chess as an obvious example and noted that machines were able to beat the best human Go player. This happened faster than anyone anticipated because of the game's very clear rules and limited set of moves. Housman elaborated, "Language is almost the opposite of that. There aren't as clearly-cut and defined rules. The conversation can go in an infinite number of different directions. And then of course, you need labeled data. You need to tell the machine to do it right or wrong."

human parity machine translation, positive experience, translator, (3 more...)

#artificialintelligence

Industry: Leisure & Entertainment > Games (0.57)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.95)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.51)

Attention Forcing for Machine Translation

Dou, Qingyun, Lu, Yiting, Manakul, Potsawee, Wu, Xixin, Gales, Mark J. F.

arXiv.org Artificial Intelligence, Apr-2-2021

Auto-regressive sequence-to-sequence models with attention mechanisms have achieved state-of-the-art performance in various tasks including Text-To-Speech (TTS) and Neural Machine Translation (NMT). The standard training approach, teacher forcing, guides a model with the reference output history. At inference stage, the generated output history must be used. This mismatch can impact performance. However, it is highly challenging to train the model using the generated output. Several approaches have been proposed to address this problem, normally by selectively using the generated output history. To make training stable, these approaches often require a heuristic schedule or an auxiliary classifier. This paper introduces attention forcing for NMT. This approach guides the model with the generated output history and reference attention, and can reduce the training-inference mismatch without a schedule or a classifier. Attention forcing has been successful in TTS, but its application to NMT is more challenging, due to the discrete and multi-modal nature of the output space. To tackle this problem, this paper adds a selection scheme to vanilla attention forcing, which automatically selects a suitable training approach for each pair of training data. Experiments show that attention forcing can improve the overall translation quality and the diversity of the translations.
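The training-inference mismatch described in this abstract can be illustrated with a toy example. The sketch below is not the paper's attention forcing method; it only contrasts teacher forcing (conditioning on the reference output history) with free running (conditioning on the generated output history). The `decode` helper, the `toy_step` function, and the `BIGRAMS` table are hypothetical names introduced purely for illustration.

```python
from typing import Callable, List

def decode(step: Callable[[List[str]], str],
           reference: List[str],
           teacher_forcing: bool) -> List[str]:
    """Predict len(reference) tokens, feeding back either the reference
    history (teacher forcing) or the generated history (free running)."""
    history: List[str] = ["<s>"]
    outputs: List[str] = []
    for ref_token in reference:
        predicted = step(history)
        outputs.append(predicted)
        # The only difference between the two modes is what gets appended here.
        history.append(ref_token if teacher_forcing else predicted)
    return outputs

# Hypothetical toy "model": predicts the next token from the previous one only.
BIGRAMS = {"<s>": "the", "the": "cat", "cat": "sat", "dog": "barked"}

def toy_step(history: List[str]) -> str:
    return BIGRAMS.get(history[-1], "</s>")

reference = ["the", "dog", "barked"]
forced = decode(toy_step, reference, teacher_forcing=True)
free = decode(toy_step, reference, teacher_forcing=False)
print("teacher forcing:", forced)  # ['the', 'cat', 'barked']
print("free running:   ", free)    # ['the', 'cat', 'sat']
```

With the reference history, the wrong prediction at the second step does not affect the third step; with the generated history, the error carries forward. A model trained only in the first mode never sees the second situation, which is the mismatch the abstract refers to.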

arxiv preprint arxiv, output history, translation, (12 more...)

arXiv.org Artificial Intelligence

2104.01264

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Japan > Honshū > Kansai > Osaka Prefecture > Osaka (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

How we taught Google Translate to stop being sexist

#artificialintelligence, Apr-1-2021, 16:15:16 GMT

Online translation tools have helped us learn new languages, communicate across linguistic borders, and view foreign websites in our native tongue. But the artificial intelligence (AI) behind them is far from perfect, often replicating rather than rejecting the biases that exist within a language or a society. Such tools are especially vulnerable to gender stereotyping because some languages (such as English) don't tend to gender nouns, while others (such as German) do. When translating from English to German, translation tools have to decide which gender to assign English words like "cleaner." Overwhelmingly, the tools conform to the stereotype, opting for the feminine word in German.

gender, google translate, translation tool, (13 more...)

#artificialintelligence

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.06)

Genre: Research Report (0.31)

Industry: Education (0.36)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Many-to-English Machine Translation Tools, Data, and Pretrained Models

Gowda, Thamme, Zhang, Zhao, Mattmann, Chris A, May, Jonathan

arXiv.org Artificial Intelligence, Apr-1-2021

While there are more than 7000 languages in the world, most translation research efforts have targeted a few high-resource languages. Commercial translation systems support only one hundred languages or fewer, and do not make these models available for transfer to low resource languages. In this work, we present useful tools for machine translation research: MTData, NLCodec, and RTG. We demonstrate their usefulness by creating a multilingual neural machine translation model capable of translating from 500 source languages to English. We make this multilingual model readily downloadable and usable as a service, or as a parent model for transfer-learning to even lower-resource languages.

computational linguistic, proceedings, translation, (13 more...)

arXiv.org Artificial Intelligence

2104.0029

Country:

Europe > Portugal > Lisbon > Lisbon (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Oceania > Australia > Victoria > Melbourne (0.04)
(23 more...)

Genre: Research Report (0.40)

Industry: Government > Regional Government (0.46)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Towards General Purpose Vision Systems

Gupta, Tanmay, Kamath, Amita, Kembhavi, Aniruddha, Hoiem, Derek

arXiv.org Artificial Intelligence, Apr-1-2021

A special purpose learning system assumes knowledge of admissible tasks at design time. Adapting such a system to unforeseen tasks requires architecture manipulation such as adding an output head for each new task or dataset. In this work, we propose a task-agnostic vision-language system that accepts an image and a natural language task description and outputs bounding boxes, confidences, and text. The system supports a wide range of vision tasks such as classification, localization, question answering, captioning, and more. We evaluate the system's ability to learn multiple skills simultaneously, to perform tasks with novel skill-concept combinations, and to learn new skills efficiently and without forgetting.

architecture, classification, gpv-i, (16 more...)

arXiv.org Artificial Intelligence

2104.00743

Country: North America > United States > Illinois (0.04)

Genre: Research Report (0.64)

Industry:

Transportation > Ground > Road (0.46)
Leisure & Entertainment > Sports (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.67)
Information Technology > Artificial Intelligence > Vision > Image Understanding (0.46)

Is Label Smoothing Truly Incompatible with Knowledge Distillation: An Empirical Study

Shen, Zhiqiang, Liu, Zechun, Xu, Dejia, Chen, Zitian, Cheng, Kwang-Ting, Savvides, Marios

arXiv.org Artificial Intelligence, Apr-1-2021

This work aims to empirically clarify a recently discovered perspective that label smoothing is incompatible with knowledge distillation (Müller et al., 2019). We begin by introducing the motivation behind how this incompatibility is raised, i.e., label smoothing erases relative information between teacher logits. We provide a novel connection on how label smoothing affects distributions of semantically similar and dissimilar classes. Then we propose a metric to quantitatively measure the degree of erased information in a sample's representation. After that, we study the one-sidedness and imperfection of the incompatibility view through massive analyses, visualizations and comprehensive experiments on Image Classification, Binary Networks, and Neural Machine Translation. Finally, we broadly discuss several circumstances wherein label smoothing will indeed lose its effectiveness. Recently a large body of studies has focused on exploring the underlying relationships between these two methods. For instance, Müller et al. (2019) discovered that label smoothing could improve calibration implicitly but will hurt the effectiveness of knowledge distillation. Yuan et al. (2019) considered knowledge distillation as a dynamical form of label smoothing, as it delivered a regularization effect in training. A recent study (Lukasik et al., 2020) further noticed that label smoothing could help mitigate label noise, showing that when distilling models from noisy data, a teacher with label smoothing is helpful.
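For readers unfamiliar with label smoothing, the sketch below shows the commonly used uniform formulation, in which a one-hot target is mixed with a uniform distribution over all classes. The abstract does not state which variant the paper uses, so this formulation is an assumption; the function name `smooth_labels` and the default `epsilon` of 0.1 are illustrative choices, not taken from the source.

```python
from typing import List

def smooth_labels(true_class: int, num_classes: int, epsilon: float = 0.1) -> List[float]:
    """Uniform label smoothing (assumed formulation):
    target[k] = (1 - epsilon) * one_hot[k] + epsilon / num_classes."""
    if not 0 <= true_class < num_classes:
        raise ValueError("true_class must be in [0, num_classes)")
    if not 0.0 <= epsilon <= 1.0:
        raise ValueError("epsilon must be in [0, 1]")
    uniform = epsilon / num_classes
    return [
        (1.0 - epsilon) * (1.0 if k == true_class else 0.0) + uniform
        for k in range(num_classes)
    ]

# Example: 4 classes, the true class is index 1.
targets = smooth_labels(true_class=1, num_classes=4, epsilon=0.1)
print(targets)       # roughly 0.925 for the true class and 0.025 for each other class
print(sum(targets))  # still sums to 1 (up to floating-point rounding)
```

Every incorrect class receives the same small target mass, which matches the intuition in the abstract that smoothing can flatten differences between classes that a teacher would otherwise rank differently.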

knowledge distillation, student, teacher network, (13 more...)

arXiv.org Artificial Intelligence

2104.00676

Genre: Research Report > New Finding (0.93)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

English-Twi Parallel Corpus for Machine Translation

Azunre, Paul, Osei, Salomey, Addo, Salomey, Adu-Gyamfi, Lawrence Asamoah, Moore, Stephen, Adabankah, Bernard, Opoku, Bernard, Asare-Nyarko, Clara, Nyarko, Samuel, Amoaba, Cynthia, Appiah, Esther Dansoa, Akwerh, Felix, Lawson, Richard Nii Lante, Budu, Joel, Debrah, Emmanuel, Boateng, Nana, Ofori, Wisdom, Buabeng-Munkoh, Edwin, Adjei, Franklin, Ampomah, Isaac Kojo Essel, Otoo, Joseph, Borkor, Reindorf, Mensah, Standylove Birago, Mensah, Lucien, Marcel, Mark Amoako, Amponsah, Anokye Acheampong, Hayfron-Acquah, James Ben

arXiv.org Artificial Intelligence, Apr-1-2021

We present a parallel machine translation training corpus for English and Akuapem Twi of 25,421 sentence pairs. We used a transformer-based translator to generate initial translations in Akuapem Twi, which were later verified and corrected where necessary by native speakers to eliminate any occurrence of translationese. In addition, 697 higher quality crowd-sourced sentences are provided for use as an evaluation set for downstream Natural Language Processing (NLP) tasks. The typical use case for the larger human-verified dataset is for further training of machine translation models in Akuapem Twi. The higher quality 697 crowd-sourced dataset is recommended as a testing dataset for machine translation of English to Twi and Twi to English models. Furthermore, the Twi part of the crowd-sourced data may also be used for other tasks, such as representation learning, classification, etc. We fine-tune the transformer translation model on the training corpus and report benchmarks on the crowd-sourced test set.

akuapem twi, corpus, translation, (12 more...)

arXiv.org Artificial Intelligence

2103.15625

Country:

Europe > Finland > Uusimaa > Helsinki (0.05)
Europe > Norway (0.04)
Europe > France > Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.04)
(7 more...)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

NLP for Ghanaian Languages

arXiv.org Artificial Intelligence, Apr-1-2021

In the much-applauded interventions by Google and Microsoft through their translation services, quite a number of African languages have been integrated, but Ghanaian languages are excluded (Google, 2020; Microsoft, 2021). A historic move worth mentioning is Baidu Translate's incorporation of the Twi language in their translation service.

The advancement in machine learning computational power coupled with the recent investment within the domain by technological companies has stimulated considerable interest and brought about a legion of applications in natural language digitisation in developed countries, ...

ghana, ghanaian language, nlp ghana, (16 more...)

arXiv.org Artificial Intelligence

2103.15475

Country:

Africa > Ghana > Greater Accra > Accra (0.05)
Africa > Togo (0.04)
Africa > Sub-Saharan Africa (0.04)
(4 more...)

Genre: Research Report (0.82)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Online translators are sexist – here's how we gave them a little gender sensitivity training

#artificialintelligence, Mar-31-2021, 09:00:07 GMT

Online translation tools have helped us learn new languages, communicate across linguistic borders, and view foreign websites in our native tongue. But the artificial intelligence (AI) behind them is far from perfect, often replicating rather than rejecting the biases that exist within a language or a society. Such tools are especially vulnerable to gender stereotyping, because some languages (such as English) don't tend to gender nouns, while others (such as German) do. When translating from English to German, translation tools have to decide which gender to assign English words like "cleaner". Overwhelmingly, the tools conform to the stereotype, opting for the feminine word in German.

engineer, gender, translation tool, (13 more...)

#artificialintelligence

Industry: Education (0.50)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Zero-Shot Language Transfer vs Iterative Back Translation for Unsupervised Machine Translation

Joshi, Aviral, Huang, Chengzhi, Singh, Har Simrat

arXiv.org Artificial Intelligence, Mar-31-2021

This work focuses on comparing different solutions for machine translation on low-resource language pairs, namely zero-shot transfer learning and unsupervised machine translation. We discuss how the data size affects the performance of both unsupervised MT and transfer learning. Additionally, we look at how the domain of the data affects the result of unsupervised MT. The code for all the experiments performed in this project is accessible on GitHub.

experiment, language pair, translation, (14 more...)

arXiv.org Artificial Intelligence

2104.00106

Country:

North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Europe > Slovenia (0.04)
Asia > Thailand > Phuket > Phuket (0.04)

Genre: Research Report (0.83)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
