AITopics | Machine Translation

Machine Translation

"Machine translation (MT) is the application of computers to the task of translating texts from one natural language to another. One of the very earliest pursuits in computer science, MT has proved to be an elusive goal, but today a number of systems are available which produce output which, if not perfect, is of sufficient quality to be useful in a number of specific domains."
– Definition from the European Association for Machine Translation (EAMT).

You can translate text of your choice by using free translators such as: CAPITA, Google Translate, SDL International, SYSTRAN.

Improving Robustness of Task Oriented Dialog Systems

Einolghozati, Arash, Gupta, Sonal, Mohit, Mrinal, Shah, Rushin

arXiv.org Artificial Intelligence, Nov-12-2019

Task-oriented language understanding in dialog systems is often modeled using intents (task of a query) and slots (parameters for that task). Intent detection and slot tagging are, in turn, modeled using sentence classification and word tagging techniques respectively. Similar to adversarial attack problems with computer vision models discussed in existing literature, these intent-slot tagging models are often over-sensitive to small variations in input -- predicting different and often incorrect labels when small changes are made to a query, thus reducing their accuracy and reliability. However, evaluating a model's robustness to these changes is harder for language since words are discrete and an automated change (e.g., adding 'noise') to a query sometimes changes the meaning and thus the labels of a query. In this paper, we first describe how to create an adversarial test set to measure the robustness of these models. Furthermore, we introduce and adapt adversarial training methods as well as data augmentation using back-translation to mitigate these issues. Our experiments show that both techniques improve the robustness of the system substantially and can be combined to yield the best results.
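As a rough illustration of the intent/slot framing and of what a robustness measurement could look like, here is a minimal sketch. The abstract does not describe how the adversarial test set is actually built, so everything below is an assumption: `toy_model` is a hypothetical keyword-based stand-in for a real tagger, and `robustness` simply reports the fraction of (original, perturbed) query pairs whose predicted intent stays the same.

```python
from typing import Callable, List, Tuple

# A prediction is an intent label plus one BIO slot tag per token.
Prediction = Tuple[str, List[str]]


def toy_model(query: str) -> Prediction:
    """Hypothetical stand-in for an intent-slot tagger (keyword rules only)."""
    tokens = query.lower().split()
    intent = "set_alarm" if "alarm" in tokens else "unknown"
    # Tag tokens that look like a clock time (e.g. "7am") as a time slot.
    slots = ["B-time" if tok.endswith(("am", "pm")) else "O" for tok in tokens]
    return intent, slots


def robustness(model: Callable[[str], Prediction],
               pairs: List[Tuple[str, str]]) -> float:
    """Fraction of (original, perturbed) pairs with an unchanged intent."""
    if not pairs:
        return 0.0
    stable = sum(model(orig)[0] == model(pert)[0] for orig, pert in pairs)
    return stable / len(pairs)


# Perturbations that are assumed to preserve the meaning of the query.
pairs = [
    ("set an alarm for 7am", "please set an alarm for 7am"),  # added word
    ("set an alarm for 7am", "set an alarmm for 7am"),        # typo noise
]
score = robustness(toy_model, pairs)
print(score)
```

The toy tagger survives the added word but not the typo, which is the kind of over-sensitivity the abstract describes. A real evaluation would also compare slot tags and would need to confirm that each perturbation keeps the original labels valid.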

accuracy, adversarial example, data augmentation, (15 more...)

arXiv.org Artificial Intelligence

1911.05153

Country:

Asia (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States > Virginia (0.04)
(3 more...)

Genre: Research Report (0.50)

Industry: Information Technology > Security & Privacy (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.95)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Syntax-Infused Transformer and BERT models for Machine Translation and Natural Language Understanding

Sundararaman, Dhanasekar, Subramanian, Vivek, Wang, Guoyin, Si, Shijing, Shen, Dinghan, Wang, Dong, Carin, Lawrence

arXiv.org Machine Learning, Nov-9-2019

Attention-based models have shown significant improvement over traditional algorithms in several NLP tasks. The Transformer, for instance, is an illustrative example that generates abstract representations of tokens input to an encoder based on their relationships to all tokens in a sequence. Recent studies have shown that although such models are capable of learning syntactic features purely by seeing examples, explicitly feeding this information to deep learning models can significantly enhance their performance. Leveraging syntactic information like part of speech (POS) may be particularly beneficial in limited training data settings for complex models such as the Transformer. We show that the syntax-infused Transformer with multiple features achieves an improvement of 0.7 BLEU when trained on the full WMT '14 English to German translation dataset and a maximum improvement of 1.99 BLEU points when trained on a fraction of the dataset. In addition, we find that the incorporation of syntax into BERT fine-tuning outperforms the baseline on a number of downstream tasks from the GLUE benchmark.

Introduction: Attention-based deep learning models for natural language processing (NLP) have shown promise for a variety of machine translation and natural language understanding tasks. For word-level, sequence-to-sequence tasks such as translation, paraphrasing, and text summarization, attention-based models allow a single token (e.g., a word or subword) in a sequence to be represented as a combination of all tokens in the sequence (Luong, Pham, and Manning, 2015). The distributed context allows attention-based models to infer rich representations for tokens, leading to more robust performance.
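The idea that each token is represented as a combination of all tokens in the sequence can be made concrete with a small sketch of scaled dot-product self-attention. This is a deliberately simplified version written for illustration: it omits the learned query, key and value projections that a real Transformer uses, it is not the syntax-infused model from the paper, and the function names are hypothetical.

```python
import math
from typing import List

Vector = List[float]


def softmax(xs: List[float]) -> List[float]:
    """Numerically stable softmax."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens: List[Vector]) -> List[Vector]:
    """Re-express every token as a weighted average of all tokens.

    Simplification: the token vectors themselves serve as queries, keys
    and values (no learned projection matrices).
    """
    d = len(tokens[0])
    outputs = []
    for query in tokens:
        # Similarity of this token to every token, scaled by sqrt(d).
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
                  for key in tokens]
        weights = softmax(scores)
        # Weighted combination of all token vectors.
        outputs.append([sum(w * tok[j] for w, tok in zip(weights, tokens))
                        for j in range(d)])
    return outputs


embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
contextual = self_attention(embeddings)
print(contextual)
```

Each output vector mixes information from the whole sequence, with the largest weights going to the tokens most similar to the query token.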

subword, transformer, translation, (15 more...)

arXiv.org Machine Learning

1911.06156

Country:

North America > United States > North Carolina > Durham County > Durham (0.04)
Europe > Italy > Tuscany > Florence (0.04)
Asia > China (0.04)
(2 more...)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

A Massive Collection of Cross-Lingual Web-Document Pairs

El-Kishky, Ahmed, Chaudhary, Vishrav, Guzman, Francisco, Koehn, Philipp

arXiv.org Machine Learning, Nov-9-2019

Cross-lingual document alignment aims to identify pairs of documents in two distinct languages that are of comparable content or translations of each other. Small-scale efforts have been made to collect aligned document-level data on a limited set of language pairs such as English-German or on limited comparable collections such as Wikipedia. In this paper, we mine twelve snapshots of the Common Crawl corpus and identify web document pairs that are translations of each other. We release a new web dataset consisting of 54 million URL pairs from Common Crawl covering documents in 92 languages paired with English. We evaluate the quality of the dataset by measuring the quality of machine translations from models that have been trained on mined parallel sentence pairs from this aligned corpus and introduce a simple yet effective baseline for identifying these aligned documents. The objective of this dataset and paper is to foster new research in cross-lingual NLP across a variety of low, mid, and high-resource languages.

computational linguistic, document pair, proceedings, (14 more...)

arXiv.org Machine Learning

1911.06154

Country:

Europe > Italy > Tuscany > Florence (0.04)
Europe > Iceland > Capital Region > Reykjavik (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Modelling Bahdanau Attention using Election methods aided by Q-Learning

Bal, Rakesh, Sinha, Sayan

arXiv.org Machine Learning, Nov-9-2019

Neural Machine Translation has lately gained a lot of "attention" with the advent of increasingly sophisticated and drastically improved models. The attention mechanism has proved to be a boon in this direction by providing weights to the input words, making it easy for the decoder to identify words representing the present context. But as newer, more complex attention models were developed, they required more computation, making inference slow. In this paper, we have modelled the attention network using techniques resonating with social choice theory. Along with that, the attention mechanism, being a Markov Decision Process, has been represented by reinforcement learning techniques. Thus, we propose to use an election method (k-Borda), fine-tuned using Q-learning, as a replacement for attention networks. The inference time for this network is less than that of a standard Bahdanau translator, and the results of the translation are comparable. This not only experimentally verifies the claims stated above but also provides faster inference.
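For readers unfamiliar with the election method named above, the following is a minimal sketch of the textbook k-Borda rule: each voter ranks all candidates, a candidate in position p of an m-candidate ranking earns m - 1 - p points, and the k candidates with the highest totals are elected. The abstract does not say how voters and candidates map onto the attention network or how Q-learning tunes the result, so the ballots of words below are purely hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def k_borda(rankings: Sequence[Sequence[str]], k: int) -> List[str]:
    """Return the k candidates with the highest total Borda score."""
    scores: Dict[str, int] = defaultdict(int)
    for ranking in rankings:
        m = len(ranking)
        for position, candidate in enumerate(ranking):
            # First place earns m - 1 points, last place earns 0.
            scores[candidate] += m - 1 - position
    # Highest score first; ties are broken alphabetically for determinism.
    ordered = sorted(scores, key=lambda c: (-scores[c], c))
    return ordered[:k]


# Hypothetical ballots: each "voter" ranks the input words by relevance.
ballots = [
    ["the", "cat", "sat", "down"],
    ["cat", "the", "down", "sat"],
    ["cat", "sat", "the", "down"],
]
committee = k_borda(ballots, k=2)
print(committee)  # ['cat', 'the']
```

Here "cat" collects 8 points and "the" collects 6, so they form the elected pair. The tie-breaking rule is an arbitrary choice made for this sketch.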

attention mechanism, attention weight, cosine distance, (13 more...)

arXiv.org Machine Learning

1911.03853

Country: Asia > India > West Bengal > Kharagpur (0.05)

Genre: Research Report (0.50)

Industry: Government > Voting & Elections (0.31)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.69)

Instance-based Transfer Learning for Multilingual Deep Retrieval

Arnold, Andrew O., Cohen, William W.

arXiv.org Machine Learning, Nov-8-2019

Perhaps the simplest type of multilingual transfer learning is instance-based transfer learning, in which data from the target language and the auxiliary languages are pooled, and a single model is learned from the pooled data. It is not immediately obvious when instance-based transfer learning will improve performance in this multilingual setting: for instance, a plausible conjecture is that this kind of transfer learning would help only if the auxiliary languages were very similar to the target. Here we show that at large scale, this method is surprisingly effective, leading to positive transfer on all of the 35 target languages we tested. We analyze this improvement and argue that the most natural explanation, namely direct vocabulary overlap between languages, only partially explains the performance gains: in fact, we demonstrate that target-language improvement can occur after adding data from an auxiliary language with no vocabulary in common with the target. This surprising result is due to the effect of transitive vocabulary overlaps between pairs of auxiliary and target languages.
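One way to picture a transitive vocabulary overlap is as a path in a graph whose nodes are languages and whose edges connect languages that share at least one vocabulary item. The sketch below is only an illustration of that reading, not the analysis used in the paper; the language names and vocabularies are invented placeholders.

```python
from collections import deque
from typing import Dict, Set


def shares_vocab(a: Set[str], b: Set[str]) -> bool:
    """True when two vocabularies have at least one item in common."""
    return not a.isdisjoint(b)


def transitively_linked(vocabs: Dict[str, Set[str]],
                        source: str, target: str) -> bool:
    """Breadth-first search over languages connected by direct overlap."""
    seen = {source}
    queue = deque([source])
    while queue:
        lang = queue.popleft()
        if lang == target:
            return True
        for other, vocab in vocabs.items():
            if other not in seen and shares_vocab(vocabs[lang], vocab):
                seen.add(other)
                queue.append(other)
    return False


# Invented toy vocabularies (placeholders, not real language data).
vocabs = {
    "target": {"alpha", "beta"},
    "aux_b": {"beta", "gamma"},
    "aux_a": {"gamma", "delta"},
    "isolated": {"omega"},
}
direct = shares_vocab(vocabs["aux_a"], vocabs["target"])
linked = transitively_linked(vocabs, "aux_a", "target")
print(direct, linked)  # False True
```

In this toy setup `aux_a` has no vocabulary in common with `target`, yet the two are connected through `aux_b`, so pooled training data from `aux_a` could still influence shared parameters that matter for `target`.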

instance-based transfer, overlap, target language, (14 more...)

arXiv.org Machine Learning

1911.06111

Country:

Europe > Ukraine > Kyiv Oblast > Kyiv (0.15)
Europe > Belgium > Brussels-Capital Region > Brussels (0.05)
Oceania > Australia > Victoria > Melbourne (0.04)
(4 more...)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (1.00)

Biconditional Generative Adversarial Networks for Multiview Learning with Missing Views

Doinychko, Anastasiia, Amini, Massih-Reza

arXiv.org Machine Learning, Nov-7-2019

In this paper, we present a conditional GAN with two generators and a common discriminator for multiview learning problems where observations have two views, but one of them may be missing for some of the training samples. This is, for example, the case for multilingual collections where documents are not available in all languages. Some studies tackled this problem by assuming the existence of view generation functions to approximately complete the missing views; for example, Machine Translation to translate documents into the missing languages. These functions generally require an external resource to be set and their quality has a direct impact on the performance of the learned multiview classifier over the completed training set. Our proposed approach addresses this problem by jointly learning the missing views and the multiview classifier using a tripartite game with two generators and a discriminator. Each of the generators is associated with one of the views and tries to fool the discriminator by generating the other missing view conditionally on the corresponding observed view. The discriminator then tries to identify if, for an observation, one of its views is completed by one of the generators or if both views are completed along with its class. Our results on a subset of Reuters RCV1/RCV2 collections show that the discriminator achieves significant classification performance; and that the generators learn the missing views with high quality without the need of any consequent external resource.

cond 2, discriminator, generator, (15 more...)

arXiv.org Machine Learning

1911.01861

Country:

North America > Saint Martin (0.04)
Europe > France > Auvergne-Rhône-Alpes > Isère > Grenoble (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Can Neural Networks Learn Symbolic Rewriting?

Piotrowski, Bartosz, Urban, Josef, Brown, Chad E., Kaliszyk, Cezary

arXiv.org Artificial Intelligence, Nov-7-2019

This work investigates whether current neural architectures are adequate for learning symbolic rewriting. Two kinds of data sets are proposed for this research -- one based on automated proofs and the other being a synthetic set of polynomial terms. Experiments using current neural machine translation models are performed and their results are discussed. Ideas for extending this line of research are proposed and its relevance is motivated.

experiment, international conference, polynomial, (12 more...)

arXiv.org Artificial Intelligence

1911.04873

Country:

Europe > Czechia > Prague (0.05)
South America > Brazil (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
(4 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.96)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.90)

Google's New AI Milestone: Neural Machine Translation Engine Can Now Translate 103 Languages

#artificialintelligence, Nov-6-2019, 23:28:40 GMT

Neural Machine Translation (NMT), one of the most important topics in deep learning, has gained much attention from industry and academia over the last few years. In order to create simple models out of complex ones, tech giant Google has for several years been innovating in human-to-machine and machine-to-human translation. Back in 2017, the company introduced a solution that uses a simple Neural Machine Translation (NMT) model to translate between multiple languages, in which the researchers merged 12 language pairs into a single model. The models fall into three types: many-to-one, one-to-many and many-to-many. Recently, researchers on the Google AI team built an enhanced NMT system and published a paper titled "Massively Multilingual Neural Machine Translation in the Wild: Findings and Challenges".

neural machine translation, neural machine translation engine, new ai milestone, (7 more...)

#artificialintelligence

Industry: Information Technology (0.82)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Microsoft Research Asia's Systems for WMT19

Xia, Yingce, Tan, Xu, Tian, Fei, Gao, Fei, Chen, Weicong, Fan, Yang, Gong, Linyuan, Leng, Yichong, Luo, Renqian, Wang, Yiren, Wu, Lijun, Zhu, Jinhua, Qin, Tao, Liu, Tie-Yan

arXiv.org Machine Learning, Nov-6-2019

Abstract: We, Microsoft Research Asia, made submissions to 11 language directions in the WMT19 news translation tasks. We won first place in 8 of the 11 directions and second place in the other three. Our basic systems are built on Transformer, back translation and knowledge distillation. We integrate several of our recent techniques to enhance the baseline systems: multi-agent dual learning (MADL), masked sequence-to-sequence pre-training (MASS), neural architecture optimization (NAO), and soft contextual data augmentation (SCA).

1 Introduction: We participated in the WMT19 shared news translation task in 11 translation directions. We achieved first place for 8 directions: German↔English, German↔French, Chinese→English, English→Lithuanian, English→Finnish, and Russian→English, and three other directions were placed second (ranked by teams), which included Lithuanian→English, Finnish→English, and English→Kazakh. Our basic systems are based on Transformer, back translation and knowledge distillation. We experimented with several techniques we proposed recently. In brief, the innovations we introduced are:

Multi-agent dual learning (MADL): The core idea of dual learning is to leverage the duality between the primal task (mapping from domain X to domain Y) and dual task (mapping from domain Y to X) to boost the performances of both tasks. MADL (Wang et al., 2019) extends the dual learning (He et al., 2016; Xia et al., 2017a) framework by introducing multiple primal and dual models. It was integrated into our submitted systems for ...

knowledge distillation, monolingual data, translation, (15 more...)

arXiv.org Machine Learning

1911.06191

Country:

Asia > China (0.04)
North America > United States > Texas > Travis County > Austin (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Domain, Translationese and Noise in Synthetic Data for Neural Machine Translation

Bogoychev, Nikolay, Sennrich, Rico

arXiv.org Machine Learning, Nov-6-2019

The quality of neural machine translation can be improved by leveraging additional monolingual resources to create synthetic training data. Source-side monolingual data can be (forward-)translated into the target language for self-training; target-side monolingual data can be back-translated. It has been widely reported that back-translation delivers superior results, but could this be due to artefacts in the test sets? We perform a case study using the French-English news translation task and separate test sets based on their original languages. We show that forward translation delivers superior gains in terms of BLEU on sentences that were originally in the source language, complementing previous studies which show large improvements with back-translation on sentences that were originally in the target language. To better understand when and why forward and back-translation are effective, we study the role of domains, translationese, and noise. While translationese effects are well known to influence MT evaluation, we also find evidence that news data from different languages shows subtle domain differences, which is another explanation for varying performance on different portions of the test set. We perform additional low-resource experiments which demonstrate that forward translation is more sensitive to the quality of the initial translation system than back-translation, and tends to perform worse in low-resource settings.
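The difference between the two kinds of synthetic data comes down to which side of each training pair is machine-generated. The sketch below shows this under toy assumptions: the word-by-word dictionary "translators" are hypothetical stand-ins for real NMT systems, and the function names are invented for illustration.

```python
from typing import Callable, Dict, List, Tuple

Translator = Callable[[str], str]
Pair = Tuple[str, str]  # (source sentence, target sentence)


def forward_translate(src_mono: List[str], src_to_tgt: Translator) -> List[Pair]:
    """Self-training data: real source sentences, synthetic targets."""
    return [(s, src_to_tgt(s)) for s in src_mono]


def back_translate(tgt_mono: List[str], tgt_to_src: Translator) -> List[Pair]:
    """Back-translation data: synthetic sources, real target sentences."""
    return [(tgt_to_src(t), t) for t in tgt_mono]


def make_translator(table: Dict[str, str]) -> Translator:
    """Toy word-by-word translator; unknown words are copied unchanged."""
    return lambda sentence: " ".join(table.get(w, w) for w in sentence.split())


# Toy French-to-English dictionary and its inverse (illustrative only).
FR_EN = {"le": "the", "chat": "cat", "dort": "sleeps"}
EN_FR = {en: fr for fr, en in FR_EN.items()}

fr_to_en = make_translator(FR_EN)
en_to_fr = make_translator(EN_FR)

forward_pairs = forward_translate(["le chat dort"], fr_to_en)
back_pairs = back_translate(["the cat sleeps"], en_to_fr)
print(forward_pairs)
print(back_pairs)
```

With forward translation any translation errors end up on the target side that the model learns to produce, whereas with back-translation they end up on the input side, which is consistent with the abstract's finding that forward translation is more sensitive to the quality of the initial system.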

computational linguistic, forward translation, translation, (14 more...)

arXiv.org Machine Learning

1911.03362

Country:

Europe > Italy > Tuscany > Florence (0.05)
Europe > Germany > Berlin (0.04)
Europe > Denmark > Capital Region > Copenhagen (0.04)
(13 more...)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
