AITopics | Machine Translation

Machine Translation

"Machine translation (MT) is the application of computers to the task of translating texts from one natural language to another. One of the very earliest pursuits in computer science, MT has proved to be an elusive goal, but today a number of systems are available which produce output which, if not perfect, is of sufficient quality to be useful in a number of specific domains."
– Definition from the European Association for Machine Translation (EAMT).

You can translate text of your choice by using free translators such as: CAPITA, Google Translate, SDL International, SYSTRAN.

Weakly Supervised Grammatical Error Correction using Iterative Decoding

Lichtarge, Jared, Alberti, Christopher, Kumar, Shankar, Shazeer, Noam, Parmar, Niki

arXiv.org Machine Learning, Oct-30-2018

We describe an approach to Grammatical Error Correction (GEC) that is effective at making use of models trained on large amounts of weakly supervised bitext. We train the Transformer sequence-to-sequence model on 4B tokens of Wikipedia revisions and employ an iterative decoding strategy that is tailored to the loosely-supervised nature of the Wikipedia training corpus. Finetuning on the Lang-8 corpus and ensembling yields an F0.5 of 58.3 on the CoNLL'14 benchmark and a GLEU of 62.4 on JFLEG. The combination of weakly supervised training and iterative decoding obtains an F0.5 of 48.2 on CoNLL'14 even without using any labeled GEC data.
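The iterative decoding idea can be illustrated with a minimal sketch. It assumes a stopping rule in which the corrector is re-applied until its output no longer changes; the abstract does not spell out the actual acceptance criterion. The names `iterative_decode`, `toy_corrector`, and `TOY_FIXES` are hypothetical, and the toy corrector is only a stand-in for a trained sequence-to-sequence model.

```python
from typing import Callable

def iterative_decode(sentence: str,
                     correct_once: Callable[[str], str],
                     max_rounds: int = 10) -> str:
    """Re-apply a single-pass corrector until the text stops changing."""
    current = sentence
    for _ in range(max_rounds):
        revised = correct_once(current)
        if revised == current:  # fixed point: no further edit is proposed
            break
        current = revised
    return current

# Hypothetical stand-in for a trained model: it fixes at most one error per pass,
# mimicking a model that makes small, incremental rewrites.
TOY_FIXES = [("have went", "have gone"), ("an book", "a book"), ("teh", "the")]

def toy_corrector(text: str) -> str:
    for wrong, right in TOY_FIXES:
        if wrong in text:
            return text.replace(wrong, right, 1)
    return text

result = iterative_decode("I have went to teh shop for an book", toy_corrector)
print(result)
```

Because each pass makes only one change, a single decoding step would leave errors behind; repeating the step lets the corrections accumulate.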

data quality, machine learning, natural language, (18 more...)

arXiv.org Machine Learning

1811.0171

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Quality > Data Cleaning (0.65)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.64)

Area Attention

Li, Yang, Kaiser, Lukasz, Bengio, Samy, Si, Si

arXiv.org Artificial Intelligence, Oct-30-2018

Existing attention mechanisms are mostly item-based in that a model is designed to attend to a single item in a collection of items (the memory). Intuitively, an area in the memory that may contain multiple items can be worth attending to as a whole. We propose area attention: a way to attend to an area of the memory, where each area contains a group of items that are either spatially adjacent when the memory has a 2-dimensional structure, such as images, or temporally adjacent for 1-dimensional memory, such as natural language sentences. Importantly, the size of an area, i.e., the number of items in an area, can vary depending on the learned coherence of the adjacent items. By giving the model the option to attend to an area of items, instead of only a single item, we hope attention mechanisms can better capture the nature of the task. Area attention can work alongside multi-head attention for attending to multiple areas in the memory. We evaluate area attention on two tasks: neural machine translation and image captioning, and improve upon strong (state-of-the-art) baselines in both cases. These improvements are obtainable with a basic form of area attention that is parameter free. In addition to proposing the novel concept of area attention, we contribute an efficient way for computing it by leveraging the technique of summed area tables.
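A minimal sketch of area attention over a 1-dimensional memory follows. It assumes one parameter-free choice that the abstract does not spell out: each area's key is the mean of its item keys and its value is the sum of its item values. Prefix sums play the role of a 1-dimensional summed area table, so every span sum costs two lookups. The function names and the `max_width` parameter are hypothetical.

```python
import math

def prefix_sums(vectors):
    """Return P where P[i] is the elementwise sum of vectors[:i]; P[0] is zeros."""
    dim = len(vectors[0])
    sums = [[0.0] * dim]
    for vec in vectors:
        sums.append([s + v for s, v in zip(sums[-1], vec)])
    return sums

def area_attention_1d(query, keys, values, max_width=2):
    """Attend over every contiguous span of up to max_width items."""
    key_ps, val_ps = prefix_sums(keys), prefix_sums(values)
    areas, scores, area_values = [], [], []
    for start in range(len(keys)):
        for width in range(1, max_width + 1):
            end = start + width
            if end > len(keys):
                break
            # Span sums come from two prefix-sum lookups (summed-area-table trick).
            key_mean = [(b - a) / width for a, b in zip(key_ps[start], key_ps[end])]
            val_sum = [b - a for a, b in zip(val_ps[start], val_ps[end])]
            areas.append((start, end))
            scores.append(sum(q * k for q, k in zip(query, key_mean)))
            area_values.append(val_sum)
    # Numerically stable softmax over all candidate areas.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(values[0])
    output = [sum(w * v[d] for w, v in zip(weights, area_values)) for d in range(dim)]
    return areas, weights, output

keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[1.0], [2.0], [3.0]]
areas, weights, output = area_attention_1d([1.0, 0.0], keys, values, max_width=2)
print(areas)
```

With `max_width=1` every area is a single item, so the sketch reduces to ordinary item-based attention.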

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

1810.10126

Country: North America > United States (0.15)

Genre: Research Report (0.72)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Learning to Teach with Dynamic Loss Functions

Wu, Lijun, Tian, Fei, Xia, Yingce, Fan, Yang, Qin, Tao, Lai, Jianhuang, Liu, Tie-Yan

arXiv.org Artificial Intelligence, Oct-29-2018

Teaching is critical to human society: it is with teaching that prospective students are educated and human civilization can be inherited and advanced. A good teacher not only provides his/her students with qualified teaching materials (e.g., textbooks), but also sets up appropriate learning objectives (e.g., course projects and exams) considering different situations of a student. When it comes to artificial intelligence, treating machine learning models as students, the loss functions that are optimized act as perfect counterparts of the learning objective set by the teacher. In this work, we explore the possibility of imitating human teaching behaviors by dynamically and automatically outputting appropriate loss functions to train machine learning models. Different from typical learning settings in which the loss function of a machine learning model is predefined and fixed, in our framework, the loss function of a machine learning model (we call it student) is defined by another machine learning model (we call it teacher). The ultimate goal of the teacher model is to cultivate the student to perform better on a development dataset. Towards that end, similar to human teaching, the teacher, a parametric model, dynamically outputs different loss functions that will be used and optimized by its student model at different training stages. We develop an efficient learning method for the teacher model that makes gradient-based optimization possible, avoiding ineffective solutions such as policy optimization. We name our method "learning to teach with dynamic loss functions" (L2T-DLF for short). Extensive experiments on real world tasks including image classification and neural machine translation demonstrate that our method significantly improves the quality of various student models.

machine learning, reinforcement learning, student model, (16 more...)

arXiv.org Artificial Intelligence

1810.12081

Country:

Asia > China (0.46)
North America > United States (0.28)

Genre: Research Report (0.82)

Industry: Education > Educational Technology > Educational Software (0.95)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.97)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.69)

Learning to Screen for Fast Softmax Inference on Large Vocabulary Neural Networks

Chen, Patrick H., Si, Si, Kumar, Sanjiv, Li, Yang, Hsieh, Cho-Jui

arXiv.org Machine Learning, Oct-29-2018

Neural language models have been widely used in various NLP tasks, including machine translation, next-word prediction and conversational agents. However, it is challenging to deploy these models on mobile devices due to their slow prediction speed, where the bottleneck is computing the top candidates in the softmax layer. In this paper, we introduce a novel softmax layer approximation algorithm by exploiting the clustering structure of context vectors. Our algorithm uses a light-weight screening model to predict a much smaller set of candidate words based on the given context, and then conducts an exact softmax only within that subset. Training such a procedure end-to-end is challenging as traditional clustering methods are discrete and non-differentiable, and thus unable to be used with back-propagation in the training process. Using the Gumbel softmax, we are able to train the screening model end-to-end on the training set to exploit data distribution. The algorithm achieves an order of magnitude faster inference than the original softmax layer for predicting top-k words in various tasks such as beam search in machine translation or next-word prediction. For example, for a machine translation task on a German-to-English dataset with a vocabulary of around 25K, we can achieve a 20.4x speedup with 98.9% precision@1 and 99.3% precision@5 relative to the original softmax layer prediction, while the state of the art (MSRprediction) only achieves a 6.7x speedup with 98.7% precision@1 and 98.1% precision@5 for the same task.
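The two-stage inference described in this abstract can be sketched as follows. The sketch assumes the screening model is a nearest-centroid classifier with fixed, hand-written centroids and candidate lists; in the paper the screening model is learned end-to-end with the Gumbel softmax, which is not reproduced here. All names and numbers are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def screened_softmax(context, centroids, candidates, word_vectors):
    """Pick a cluster for the context, then run an exact softmax on its candidates only."""
    # Screening step: one cheap score per cluster instead of one per vocabulary word.
    cluster = max(range(len(centroids)), key=lambda c: dot(context, centroids[c]))
    subset = candidates[cluster]
    # Exact softmax restricted to the screened candidate words.
    logits = [dot(context, word_vectors[i]) for i in subset]
    top = max(logits)
    exps = [math.exp(l - top) for l in logits]
    total = sum(exps)
    return {i: e / total for i, e in zip(subset, exps)}

# Toy vocabulary of six words that falls into two clusters.
word_vectors = [[2.0, 0.0], [1.5, 0.2], [1.0, 0.1],
                [0.0, 2.0], [0.1, 1.5], [0.2, 1.0]]
centroids = [[1.0, 0.0], [0.0, 1.0]]
candidates = [[0, 1, 2], [3, 4, 5]]

probs = screened_softmax([1.0, 0.1], centroids, candidates, word_vectors)
best_word = max(probs, key=probs.get)
print(best_word, probs)
```

The saving comes from scoring only the candidate subset; the approximation is good to the extent that the true top words for a context fall inside the screened cluster.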

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Machine Learning

1810.12406

Country:

Asia > Vietnam (0.28)
North America > United States > California (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.49)

Parallel Attention Mechanisms in Neural Machine Translation

Medina, Julian Richard, Kalita, Jugal

arXiv.org Artificial Intelligence, Oct-29-2018

Recent papers in neural machine translation have proposed the strict use of attention mechanisms over previous standards such as recurrent and convolutional neural networks (RNNs and CNNs). We propose that by running traditionally stacked encoding branches from encoder-decoder attention-focused architectures in parallel, even more sequential operations can be removed from the model, thereby decreasing training time. In particular, we modify the recently published attention-based architecture called Transformer by Google, by replacing sequential attention modules with parallel ones, reducing the amount of training time and substantially improving BLEU scores at the same time. Experiments over the English to German and English to French translation tasks show that our model establishes a new state of the art. Historically, statistical machine translation involved extensive work in the alignment of words and phrases developed by linguistic experts working with computer scientists [1].

artificial intelligence, machine learning, natural language, (15 more...)

arXiv.org Artificial Intelligence

1810.12427

Country: North America > United States > Colorado (0.16)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Bible is providing data to help create AI that can convert texts

Daily Mail - Science & tech, Oct-24-2018, 17:27:43 GMT

Scientists are now using the Bible to help algorithms perfect their language skills. An AI has been trained on various versions of the sacred text so it can convert written works into different styles for different audiences. Each version of the Bible contains more than 31,000 verses that the researchers used to produce over 1.5 million unique pairings of source and target verses. The Bible is helping algorithms perfect their translation skills. Internet tools that translate text between languages like English and Spanish are widely available.

artificial intelligence, bible, natural language, (16 more...)

Daily Mail - Science & tech

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.32)

Multi-Head Attention with Disagreement Regularization

Li, Jian, Tu, Zhaopeng, Yang, Baosong, Lyu, Michael R., Zhang, Tong

arXiv.org Artificial Intelligence, Oct-24-2018

Multi-head attention is appealing for the ability to jointly attend to information from different representation subspaces at different positions. In this work, we introduce a disagreement regularization to explicitly encourage the diversity among multiple attention heads. Specifically, we propose three types of disagreement regularization, which respectively encourage the subspace, the attended positions, and the output representation associated with each attention head to be different from other heads. Experimental results on widely-used WMT14 English-German and WMT17 Chinese-English translation tasks demonstrate the effectiveness and universality of the proposed approach.
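One plausible way to measure disagreement among heads is sketched below, as an assumption rather than the paper's exact formulation: take the average pairwise cosine similarity between per-head vectors (for example, each head's output representation) and negate it, so that more diverse heads score higher. The function names are hypothetical.

```python
import math
from itertools import combinations

def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def disagreement(head_vectors):
    """Negative mean pairwise cosine similarity across attention heads."""
    pairs = list(combinations(head_vectors, 2))
    if not pairs:  # a single head has nothing to disagree with
        return 0.0
    return -sum(cosine(u, v) for u, v in pairs) / len(pairs)

identical_heads = [[1.0, 2.0], [1.0, 2.0]]
orthogonal_heads = [[1.0, 0.0], [0.0, 1.0]]
print(disagreement(identical_heads), disagreement(orthogonal_heads))
```

Under this assumption, a training objective would subtract a weighted disagreement term from the translation loss, so that minimizing the total pushes the heads apart.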

artificial intelligence, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

1810.10183

Country: Asia > China (0.29)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.78)

Modeling Localness for Self-Attention Networks

Yang, Baosong, Tu, Zhaopeng, Wong, Derek F., Meng, Fandong, Chao, Lidia S., Zhang, Tong

arXiv.org Artificial Intelligence, Oct-24-2018

Self-attention networks have proven to be of profound value for their strength in capturing global dependencies. In this work, we propose to model localness for self-attention networks, which enhances the ability to capture useful local context. We cast localness modeling as a learnable Gaussian bias, which indicates the center and scope of the local region that should receive more attention. The bias is then incorporated into the original attention distribution to form a revised distribution. To maintain the strength of capturing long-distance dependencies and enhance the ability to capture short-range dependencies, we only apply localness modeling to lower layers of self-attention networks. Quantitative and qualitative analyses on Chinese-English and English-German translation tasks demonstrate the effectiveness and universality of the proposed approach.
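The effect of a Gaussian bias on an attention distribution can be sketched as follows. The sketch assumes the bias is added to the attention logits before the softmax and that the standard deviation is half the window size; in the paper the center and scope are learned, whereas here they are plain arguments. The names `local_attention_weights`, `center`, and `window` are hypothetical.

```python
import math

def softmax(xs):
    top = max(xs)
    exps = [math.exp(x - top) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def local_attention_weights(logits, center, window):
    """Add a Gaussian bias centered at `center` to the logits, then normalize."""
    sigma = window / 2.0  # assumption: scope of the local region sets the spread
    biased = [l - (j - center) ** 2 / (2.0 * sigma ** 2)
              for j, l in enumerate(logits)]
    return softmax(biased)

# Uniform logits make the effect of the bias easy to see.
weights = local_attention_weights([0.0] * 5, center=2.0, window=2.0)
print([round(w, 3) for w in weights])
```

Positions far from the center are penalized quadratically, so the revised distribution concentrates on the local region; a very large window leaves the original distribution almost unchanged.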

localness, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

1810.10182

Country: Asia > Macao (0.14)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Exploiting Deep Representations for Neural Machine Translation

Dou, Zi-Yi, Tu, Zhaopeng, Wang, Xing, Shi, Shuming, Zhang, Tong

arXiv.org Artificial Intelligence, Oct-24-2018

Advanced neural machine translation (NMT) models generally implement encoder and decoder as multiple layers, which allows systems to model complex functions and capture complicated linguistic structures. However, only the top layers of encoder and decoder are leveraged in the subsequent process, which misses the opportunity to exploit the useful information embedded in other layers. In this work, we propose to simultaneously expose all of these signals with layer aggregation and multi-layer attention mechanisms. In addition, we introduce an auxiliary regularization term to encourage different layers to capture diverse information. Experimental results on widely-used WMT14 English-German and WMT17 Chinese-English translation data demonstrate the effectiveness and universality of the proposed approach.

artificial intelligence, information, natural language, (17 more...)

arXiv.org Artificial Intelligence

1810.10181

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Baidu's AI Can Do Simultaneous Translation Between Any Two Languages

IEEE Spectrum Robotics, Oct-23-2018, 23:49:24 GMT

Would-be travelers of the galaxy, rejoice: The Chinese tech giant Baidu has invented a translation system that brings us one step closer to a software Babel fish. For those unfamiliar with the Douglas Adams masterworks of science fiction, let me explain. The Babel fish is a slithery fictional creature that takes up residence in the ear canal of humans, tapping into their neural systems to provide instant translation of any language they hear. In the real world, until now, we've had to make do with human and software interpreters that do their best to keep up. But the new AI-powered tool from Baidu Research, called STACL, could speed things up considerably.

artificial intelligence, natural language, translation, (15 more...)

IEEE Spectrum Robotics

Industry:

Government (0.36)
Information Technology (0.36)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.92)
