 Machine Translation


What Are Major NLP Achievements & Papers From 2019?

#artificialintelligence

In 2018 we saw a number of landmark research breakthroughs in the field of natural language processing (NLP). The introduction of transfer learning and pretrained language models in NLP pushed forward the limits of language understanding and generation. These also dominated NLP progress this year. Teams from top research institutions and tech companies explored ways to make state-of-the-art language models even more sophisticated. Many improvements were driven by massive boosts in computing capacities, but many research groups also discovered ingenious ways to lighten models while maintaining high performance. In this article, we summarize 11 research papers covering key language models presented during the year as well as recent research breakthroughs in machine translation, sentiment analysis, dialogue systems, and abstractive summarization.


ART: A machine learning Automated Recommendation Tool for synthetic biology

arXiv.org Machine Learning

Synthetic biology allows us to bioengineer cells to synthesize novel valuable molecules such as renewable biofuels or anticancer drugs. However, traditional synthetic biology approaches involve ad hoc, non-systematic engineering practices, which lead to long development times. Here, we present the Automated Recommendation Tool (ART), a tool that leverages machine learning and probabilistic modeling techniques to guide synthetic biology in a systematic fashion, without the need for a full mechanistic understanding of the biological system. Using sampling-based optimization, ART provides a set of recommended strains to be built in the next engineering cycle, alongside probabilistic predictions of their production levels. We demonstrate the capabilities of ART on simulated and real data sets and discuss possible difficulties in achieving satisfactory predictive power.

Introduction: Metabolic engineering [1] enables us to bioengineer cells to synthesize novel valuable molecules such as renewable biofuels [2, 3] or anticancer drugs.


Assessing the accuracy of machine-assisted abstract screening with DistillerAI: a user study

#artificialintelligence

Web applications that employ natural language processing technologies to support systematic reviewers during abstract screening have become more common. The goal of our project was to conduct a case study to explore a screening approach that temporarily replaces a human screener with a semi-automated screening tool. We evaluated the accuracy of the approach using DistillerAI as a semi-automated screening tool. A published comparative effectiveness review served as the reference standard. Five teams of professional systematic reviewers screened the same 2472 abstracts in parallel.


Non-autoregressive Transformer by Position Learning

arXiv.org Artificial Intelligence

Non-autoregressive models are promising on various text generation tasks. Previous work rarely models the positions of generated words explicitly. However, position modeling is an essential problem in non-autoregressive text generation. In this study, we propose PNAT, which incorporates positions as a latent variable into the text generative process. Experimental results show that PNAT achieves top results on machine translation and paraphrase generation tasks, outperforming several strong baselines.


Microsoft adds Māori to translator as New Zealand pushes to revitalize the language – TechCrunch

#artificialintelligence

The benefits of machine translation are easy to see and experience for ourselves, but those practical applications are only one part of what makes the technology valuable. Microsoft and the government of New Zealand are demonstrating the potential of translation tech to help preserve and hopefully breathe new life into the Māori language. Te reo Māori, as it is called in full, is of course the language of New Zealand's largest indigenous community. But as is common elsewhere as well, the tongue has fallen into obscurity as generations of Māori have assimilated into the dominant culture of their colonizers. Māori people make up about 15 percent of the population, and only a quarter of them speak the language, making for a grand total of 3 percent that speak te reo Māori.


Optimizing Data Usage via Differentiable Rewards

arXiv.org Machine Learning

To acquire a new skill, humans learn better and faster if a tutor, based on their current knowledge level, informs them of how much attention they should pay to particular content or practice problems. Similarly, a machine learning model could potentially be trained better with a scorer that "adapts" to its current learning state and estimates the importance of each training data instance. Training such an adaptive scorer efficiently is a challenging problem; in order to precisely quantify the effect of a data instance at a given time during the training, it is typically necessary to first complete the entire training process. To efficiently optimize data usage, we propose a reinforcement learning approach called Differentiable Data Selection (DDS). In DDS, we formulate a scorer network as a learnable function of the training data, which can be efficiently updated along with the main model being trained. Specifically, DDS updates the scorer with an intuitive reward signal: it should up-weight the data whose gradient is similar to that of a dev set on which we would ultimately like to perform well. Without significant computing overhead, DDS delivers strong and consistent improvements over several strong baselines on two very different tasks: machine translation and image classification.
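The reward signal described in the abstract above can be illustrated with a small sketch. It is not the authors' implementation: the linear regression model, the use of cosine similarity as the similarity measure, and the names `per_example_grads` and `dds_style_rewards` are all assumptions made for illustration, and the scorer network and its reinforcement learning update are left out.

```python
import numpy as np

def per_example_grads(w, X, y):
    """Gradient of 0.5 * (x.w - y)^2 with respect to w, one row per example."""
    residual = X @ w - y              # shape (n,)
    return residual[:, None] * X      # shape (n, d)

def dds_style_rewards(w, X_train, y_train, X_dev, y_dev):
    """Cosine similarity of each training gradient with the mean dev gradient.

    Assumption: cosine similarity stands in for the gradient-similarity
    reward; the abstract does not specify the exact form.
    """
    g_train = per_example_grads(w, X_train, y_train)
    g_dev = per_example_grads(w, X_dev, y_dev).mean(axis=0)
    eps = 1e-12                       # avoids division by zero
    num = g_train @ g_dev
    den = np.linalg.norm(g_train, axis=1) * np.linalg.norm(g_dev) + eps
    return num / den

# Toy data (hypothetical): the dev set only cares about the first feature.
w = np.zeros(2)
X_dev = np.array([[1.0, 0.0], [2.0, 0.0]])
y_dev = np.array([1.0, 2.0])
X_train = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
y_train = np.array([1.0, -1.0, 1.0])

rewards = dds_style_rewards(w, X_train, y_train, X_dev, y_dev)
print(rewards)
```

In this toy setup the first training example pulls the weights in the same direction as the dev set and receives a reward near 1, the second pulls in the opposite direction and receives a reward near -1, and the third is orthogonal and receives a reward near 0. A scorer trained on such rewards would learn to up-weight examples like the first.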


Automatically Neutralizing Subjective Bias in Text

arXiv.org Artificial Intelligence

Texts like news, encyclopedias, and some social media strive for objectivity. Yet bias in the form of inappropriate subjectivity - introducing attitudes via framing, presupposing truth, and casting doubt - remains ubiquitous. This kind of bias erodes our collective trust and fuels social conflict. To address this issue, we introduce a novel testbed for natural language generation: automatically bringing inappropriately subjective text into a neutral point of view ("neutralizing" biased text). We also offer the first parallel corpus of biased language. The corpus contains 180,000 sentence pairs and originates from Wikipedia edits that removed various framings, presuppositions, and attitudes from biased sentences. Last, we propose two strong encoder-decoder baselines for the task. A straightforward yet opaque CONCURRENT system uses a BERT encoder to identify subjective words as part of the generation process. An interpretable and controllable MODULAR algorithm separates these steps, using (1) a BERT-based classifier to identify problematic words and (2) a novel join embedding through which the classifier can edit the hidden states of the encoder. Large-scale human evaluation across four domains (encyclopedias, news headlines, books, and political speeches) suggests that these algorithms are a first step towards the automatic identification and reduction of bias.


Fine-Tuning by Curriculum Learning for Non-Autoregressive Neural Machine Translation

arXiv.org Machine Learning

Non-autoregressive translation (NAT) models remove the dependence on previous target tokens and generate all target tokens in parallel, resulting in a significant inference speedup but at the cost of inferior translation accuracy compared to autoregressive translation (AT) models. Considering that AT models have higher accuracy and are easier to train than NAT models, and that both share the same model configurations, a natural idea to improve the accuracy of NAT models is to transfer a well-trained AT model to an NAT model through fine-tuning. However, since AT and NAT models differ greatly in training strategy, straightforward fine-tuning does not work well. In this work, we introduce curriculum learning into fine-tuning for NAT. Specifically, we design a curriculum in the fine-tuning process to progressively switch the training from autoregressive generation to non-autoregressive generation. Experiments on four benchmark translation datasets show that the proposed method achieves a good improvement (more than 1 BLEU point) over previous NAT baselines in terms of translation accuracy, and greatly speeds up inference (more than 10 times) over AT baselines.
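The idea of progressively switching from autoregressive to non-autoregressive training can be sketched as a pacing schedule. The following is a minimal illustration only: the linear pacing function, the per-step (rather than per-token) switch, and the names `nat_probability` and `choose_mode` are assumptions, not the schedule used in the paper.

```python
import random

def nat_probability(step, total_steps):
    """Hypothetical linear pacing: share of NAT-style training at `step`.

    Starts at 0.0 (pure AT training) and reaches 1.0 (pure NAT training)
    once `step` reaches `total_steps`.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    return min(1.0, max(0.0, step / total_steps))

def choose_mode(step, total_steps, rng):
    """Sample the training mode for this step according to the pacing."""
    return "NAT" if rng.random() < nat_probability(step, total_steps) else "AT"

rng = random.Random(0)  # fixed seed so the sketch is reproducible
schedule = [choose_mode(s, 1000, rng) for s in range(0, 1001, 250)]
print(schedule)
```

Early fine-tuning steps are always autoregressive and the final steps are always non-autoregressive; steps in between mix the two modes, which is the gradual transition the curriculum is meant to provide.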


What Do You Mean `Why?': Resolving Sluices in Conversations

arXiv.org Artificial Intelligence

Victor Petrén Bach Hansen (1, 2), Anders Søgaard (1, 3)
1 Department of Computer Science, University of Copenhagen, Denmark; 2 Topdanmark A/S, Denmark; 3 Google Research, Berlin
victor.petren@di.ku.dk, soegaard@di.ku.dk

Abstract: In conversation, we often ask one-word questions such as 'Why?' or 'Who?'. Such questions are typically easy for humans to answer, but can be hard for computers, because their resolution requires retrieving both the right semantic frames and the right arguments from context. This paper introduces the novel ellipsis resolution task of resolving such one-word questions, referred to as sluices in linguistics. We present a crowd-sourced dataset containing annotations of sluices from over 4,000 dialogues collected from conversational QA datasets, as well as a series of strong baseline architectures.

1 Introduction: Stand-alone wh-word questions, such as When? in Figure 1, are easy for us to understand, but in order to interpret them we need to retrieve implicit information from context. Learning to do so is an instance of sluicing, an ellipsis phenomenon, defined by Ross (1969) as 'the effect of deleting everything but the preposed constituent of an embedded question, under the condition that the remainder of the question is identical to some other part of the sentence, or a preceding sentence.' In the context of conversations, one-word wh-word questions are particularly frequent (Anand and Hardt 2016; Rønning, Hardt, and Søgaard 2018), and because they are often hard to resolve, they seem to be a frequent source of error in conversational question answering (Choi et al. 2018; Reddy, Chen, and Manning 2018) and dialogue understanding (Vlachos and Clark 2014). We refer to this type of sluicing as conversational sluicing.

Unlike previous work where sluice resolution is treated as predicting the span of the antecedent (Anand and Hardt 2016; Rønning, Hardt, and Søgaard 2018), we frame conversational sluice resolution as a Natural Language Generation (NLG) task, in which we seek to automatically generate the full question, given a question-answer context and a one-word question.

Q1: Where was the bombing?


Visualisation of embedding relations (Word2Vec, BERT)

#artificialintelligence

In this story, we will visualise the word embedding vectors to understand the relations between words described by the embeddings. This story focuses on word2vec [1] and BERT [2]. To understand the embeddings themselves, I suggest reading a separate introduction first, as this story does not aim to describe them. This story is part of my journey to develop Neural Machine Translation (NMT) using BERT contextualised embedding vectors. Word embeddings are models that generate computer-friendly numeric vector representations for words.
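As a rough illustration of what visualising embedding relations involves, the sketch below projects a few word vectors to two dimensions with PCA. The vectors are hand-made toy values, not real word2vec or BERT embeddings, and the helper name `project_2d` is an assumption for this example only.

```python
import numpy as np

def project_2d(vectors):
    """Project row vectors onto their first two principal components."""
    centered = vectors - vectors.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T

# Toy 4-dimensional "embeddings" (hypothetical values chosen so that
# king - man + woman equals queen).
words = ["king", "queen", "man", "woman"]
toy = np.array([
    [0.9, 0.8, 0.1, 0.0],   # king
    [0.9, 0.1, 0.8, 0.0],   # queen
    [0.1, 0.9, 0.1, 0.1],   # man
    [0.1, 0.2, 0.8, 0.1],   # woman
])

coords = project_2d(toy)
for word, (x, y) in zip(words, coords):
    print(f"{word:>6}: ({x:+.3f}, {y:+.3f})")

# The analogy relation holds by construction in this toy data.
analogy = toy[0] - toy[2] + toy[3]
print(np.allclose(analogy, toy[1]))
```

The resulting 2D coordinates could be passed to any plotting library; with real embeddings the vectors would come from a trained model rather than from a hand-written array.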