Machine Translation
Neural Machine Translation with Python – Towards Data Science
Machine translation, sometimes referred to by the abbreviation MT is a very challenge task that investigates the use of software to translate text or speech from one language to another. Traditionally, it involves large statistical models developed using highly sophisticated linguistic knowledge. Here we are, we are going to use deep neural networks for the problem of machine translation. We will discover how to develop a neural machine translation model for translating English to French. Our model will accept English text as input and return the French translation.
Title Generation for Web Tables
Hancock, Braden, Lee, Hongrae, Yu, Cong
Descriptive titles provide crucial context for interpreting tables that are extracted from web pages and are a key component of table-based web applications. Prior approaches have attempted to produce titles by selecting existing text snippets associated with the table. These approaches, however, are limited by their dependence on suitable titles existing a priori. In our user study, we observe that the relevant information for the title tends to be scattered across the page, and often---more than 80% of time---does not appear verbatim anywhere in the page. We propose instead the application of a sequence-to-sequence neural network model as a more generalizable means of generating high-quality titles. This is accomplished by extracting many text snippets that have potentially relevant information to the table, encoding them into an input sequence, and using both copy and generation mechanisms in the decoder to balance relevance and readability of the generated title. We validate this approach with human evaluation on sample web tables and report that while sequence models with only a copy mechanism or only a generation mechanism are easily outperformed by simple selection-based baselines, the model with both capabilities outperforms them all, approaching the quality of crowdsourced titles while training on fewer than ten thousand examples. To the best of our knowledge, the proposed technique is the first to consider text-generation methods for table titles, and establishes a new state of the art.
Neural Machine Translation for Query Construction and Composition
Soru, Tommaso, Marx, Edgard, Valdestilhas, André, Esteves, Diego, Moussallem, Diego, Publio, Gustavo
Research on question answering with knowledge base has recently seen an increasing use of deep architectures. In this extended abstract, we study the application of the neural machine translation paradigm for question parsing. We employ a sequence-to-sequence model to learn graph patterns in the SPARQL graph query language and their compositions. Instead of inducing the programs through question-answer pairs, we expect a semi-supervised approach, where alignments between questions and queries are built through templates. We argue that the coverage of language utterances can be expanded using late notable works in natural language generation.
On Adversarial Examples for Character-Level Neural Machine Translation
Ebrahimi, Javid, Lowd, Daniel, Dou, Dejing
Evaluating on adversarial examples has become a standard procedure to measure robustness of deep learning models. Due to the difficulty of creating white-box adversarial examples for discrete text input, most analyses of the robustness of NLP models have been done through black-box adversarial examples. We investigate adversarial examples for character-level neural machine translation (NMT), and contrast black-box adversaries with a novel white-box adversary, which employs differentiable string-edit operations to rank adversarial changes. We propose two novel types of attacks which aim to remove or change a word in a translation, rather than simply break the NMT. We demonstrate that white-box adversarial examples are significantly stronger than their black-box counterparts in different attack scenarios, which show more serious vulnerabilities than previously known. In addition, after performing adversarial training, which takes only 3 times longer than regular training, we can improve the model's robustness significantly.
IFlytek, CIPG Will Build National AI Translator to Meet Rising Demand
China's top voice recognition firm iFlytek has penned a deal with China International Publishing Group to build a national artificial intelligence translator and keep up with rising demand. AI translations can lift the burden off human translators, who can barely keep up with requirements at government departments and companies looking to operate overseas, state-owned news agency Xinhua cited CIPG Deputy Director Fang Zhenghui as saying. The machine can translate Chinese into 33 languages, added Liu Qingfeng, president of Anhui-based iFlytek, saying it uses cutting-edge technology to improve the accuracy of machine translations. "When translation machines fail to recognize some special nouns or specific terms, human translators can monitor the process and help to polish the text," he said. "The machine [can] learn from these mistakes and improve its work next time."
Salesforce research
Deep learning has significantly improved state-of-the-art performance for natural language processing tasks like machine translation, summarization, question answering, and text classification. Each of these tasks is typically studied with a specific metric, and performance is often measured on a set of standard benchmark datasets. This has led to the development of architectures designed specifically for those tasks and metrics, but it does not necessarily promote the emergence of general NLP models, those which can perform well across a wide variety of NLP tasks. In order to explore the possibility of such models as well as the tradeoffs that arise in optimizing for them, we introduce the Natural Language Decathlon (decaNLP). The goal of the Decathlon is to explore models that generalize to all ten tasks and investigate how such models differ from those trained for single tasks.
The Natural Language Decathlon: Multitask Learning as Question Answering
McCann, Bryan, Keskar, Nitish Shirish, Xiong, Caiming, Socher, Richard
Deep learning has improved performance on many natural language processing (NLP) tasks individually. However, general NLP models cannot emerge within a paradigm that focuses on the particularities of a single metric, dataset, and task. We introduce the Natural Language Decathlon (decaNLP), a challenge that spans ten tasks: question answering, machine translation, summarization, natural language inference, sentiment analysis, semantic role labeling, zero-shot relation extraction, goal-oriented dialogue, semantic parsing, and commonsense pronoun resolution. We cast all tasks as question answering over a context. Furthermore, we present a new Multitask Question Answering Network (MQAN) jointly learns all tasks in decaNLP without any task-specific modules or parameters in the multitask setting. MQAN shows improvements in transfer learning for machine translation and named entity recognition, domain adaptation for sentiment analysis and natural language inference, and zero-shot capabilities for text classification. We demonstrate that the MQAN's multi-pointer-generator decoder is key to this success and performance further improves with an anti-curriculum training strategy. Though designed for decaNLP, MQAN also achieves state of the art results on the WikiSQL semantic parsing task in the single-task setting. We also release code for procuring and processing data, training and evaluating models, and reproducing all experiments for decaNLP.
Learned in Translation: Contextualized Word Vectors
McCann, Bryan, Bradbury, James, Xiong, Caiming, Socher, Richard
Computer vision has benefited from initializing multiple deep layers with weights pretrained on large supervised training sets like ImageNet. Natural language processing (NLP) typically sees initialization of only the lowest layer of deep models with pretrained word vectors. In this paper, we use a deep LSTM encoder from an attentional sequence-to-sequence model trained for machine translation (MT) to contextualize word vectors. We show that adding these context vectors (CoVe) improves performance over using only unsupervised word and character vectors on a wide variety of common NLP tasks: sentiment analysis (SST, IMDb), question classification (TREC), entailment (SNLI), and question answering (SQuAD). For fine-grained sentiment analysis and entailment, CoVe improves performance of our baseline models to the state of the art.
AI Weekly: Google's research center in Ghana won't be the last AI lab in Africa
This year, we have seen an acceleration of Silicon Valley tech giants opening AI research labs around the world as they seek to gain traction among researchers and fulfill their global ambitions. In the past six months or so, Google brought labs to China and France, Facebook opened labs in Pittsburgh and Seattle, and Microsoft announced plans to open labs near universities in Berkeley, California and Melbourne, Australia. This trend shows no signs of slowing down. Last month, Samsung announced labs in Cambridge, Moscow, and Toronto. This week, Nvidia announced plans to open a new lab in Toronto, while Google shared plans to open a lab in Accra, Ghana, Google's first in Africa and perhaps the first of any tech giant in Africa.
Google Translate: How does the search giant's multilingual interpreter actually work?
Google Translate has become the internet's go-to resource for short, quick translations from foreign languages. The service was first launched in April 2006, seeing off early competition from the likes of Babel Fish. It now boasts more than 500m users daily worldwide, offering 103 languages. But how exactly does it work? How does Google News actually work?