Collaborating Authors



Fairseq(-py) is a sequence modeling toolkit that allows researchers and developers to train custom models for translation, summarization, language modeling and other text generation tasks. We also provide pre-trained models for several benchmark translation datasets. If you use Docker make sure to increase the shared memory size either with --ipc host or --shm-size as command line options to nvidia-docker run. This model uses a Byte Pair Encoding (BPE) vocabulary, so we'll have to apply the encoding to the source text before it can be translated. Prior to BPE, input text needs to be tokenized using tokenizer.perl

Findings of the 2016 WMT Shared Task on Cross-lingual Pronoun Prediction Artificial Intelligence

We describe the design, the evaluation setup, and the results of the 2016 WMT shared task on cross-lingual pronoun prediction. This is a classification task in which participants are asked to provide predictions on what pronoun class label should replace a placeholder value in the target-language text, provided in lemma-tised and PoS-tagged form. We provided four subtasks, for the English-French and English-German language pairs, in both directions. Eleven teams participated in the shared task; nine for the English-French subtask, five for French-English, nine for English-German, and six for German-English. Most of the submissions outperformed two strong language-model- based baseline systems, with systems using deep recurrent neural networks outperforming those using other architectures for most language pairs.

State-Of-The-Art Methods For Neural Machine Translation & Multilingual Tasks


The quality of machine translation produced by state-of-the-art models is already quite high and often requires only minor corrections from professional human translators. This is especially true for high-resource language pairs like English-German and English-French. So, the main focus of recent research studies in machine translation was on improving system performance for low-resource language pairs, where we have access to large monolingual corpora in each language but do not have sufficiently large parallel corpora. Facebook AI researchers seem to lead in this research area and have introduced several interesting solutions for low-resource machine translation during the last year. This includes augmenting the training data with back-translation, learning joint multilingual sentence representations, as well as extending BERT to a cross-lingual setting.

FRAGE: Frequency-Agnostic Word Representation Machine Learning

Continuous word representation (aka word embedding) is a basic building block in many neural network-based models used in natural language processing tasks. Although it is widely accepted that words with similar semantics should be close to each other in the embedding space, we find that word embeddings learned in several tasks are biased towards word frequency: the embeddings of high-frequency and low-frequency words lie in different subregions of the embedding space, and the embedding of a rare word and a popular word can be far from each other even if they are semantically similar. This makes learned word embeddings ineffective, especially for rare words, and consequently limits the performance of these neural network models. In this paper, we develop a neat, simple yet effective way to learn \emph{FRequency-AGnostic word Embedding} (FRAGE) using adversarial training. We conducted comprehensive studies on ten datasets across four natural language processing tasks, including word similarity, language modeling, machine translation and text classification. Results show that with FRAGE, we achieve higher performance than the baselines in all tasks.

Multimodal Machine Translation with Reinforcement Learning Artificial Intelligence

Multimodal machine translation is one of the applications that integrates computer vision and language processing. It is a unique task given that in the field of machine translation, many state-of-the-arts algorithms still only employ textual information. In this work, we explore the effectiveness of reinforcement learning in multimodal machine translation. We present a novel algorithm based on the Advantage Actor-Critic (A2C) algorithm that specifically cater to the multimodal machine translation task of the EMNLP 2018 Third Conference on Machine Translation (WMT18). We experiment our proposed algorithm on the Multi30K multilingual English-German image description dataset and the Flickr30K image entity dataset. Our model takes two channels of inputs, image and text, uses translation evaluation metrics as training rewards, and achieves better results than supervised learning MLE baseline models. Furthermore, we discuss the prospects and limitations of using reinforcement learning for machine translation. Our experiment results suggest a promising reinforcement learning solution to the general task of multimodal sequence to sequence learning.