AITopics | Machine Translation


Machine Translation

"Machine translation (MT) is the application of computers to the task of translating texts from one natural language to another. One of the very earliest pursuits in computer science, MT has proved to be an elusive goal, but today a number of systems are available which produce output which, if not perfect, is of sufficient quality to be useful in a number of specific domains."
– Definition from the European Association for Machine Translation (EAMT).

You can translate text of your choice by using free translators such as: CAPITA, Google Translate, SDL International, SYSTRAN.


What Are Major NLP Achievements & Papers From 2019?

#artificialintelligence | Nov-25-2019, 21:33:40 GMT

In 2018 we saw a number of landmark research breakthroughs in the field of natural language processing (NLP). The introduction of transfer learning and pretrained language models in NLP pushed forward the limits of language understanding and generation. These also dominated NLP progress this year. Teams from top research institutions and tech companies explored ways to make state-of-the-art language models even more sophisticated. Many improvements were driven by massive boosts in computing capacities, but many research groups also discovered ingenious ways to lighten models while maintaining high performance. In this article, we summarize 11 research papers covering key language models presented during the year as well as recent research breakthroughs in machine translation, sentiment analysis, dialogue systems, and abstractive summarization.

dataset, language model, translation, (16 more...)

#artificialintelligence

Country:

North America > Canada > Quebec > Montreal (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report > New Finding (0.69)

Industry: Information Technology (0.35)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.97)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.68)


ART: A machine learning Automated Recommendation Tool for synthetic biology

Radivojević, Tijana, Costello, Zak, Martin, Hector Garcia

arXiv.org Machine Learning | Nov-25-2019

Synthetic biology allows us to bioengineer cells to synthesize novel valuable molecules such as renewable biofuels or anticancer drugs. However, traditional synthetic biology approaches involve ad hoc, non-systematic engineering practices, which lead to long development times. Here, we present the Automated Recommendation Tool (ART), a tool that leverages machine learning and probabilistic modeling techniques to guide synthetic biology in a systematic fashion, without the need for a full mechanistic understanding of the biological system. Using sampling-based optimization, ART provides a set of recommended strains to be built in the next engineering cycle, alongside probabilistic predictions of their production levels. We demonstrate the capabilities of ART on simulated and real data sets and discuss possible difficulties in achieving satisfactory predictive power. Introduction: Metabolic engineering enables us to bioengineer cells to synthesize novel valuable molecules such as renewable biofuels or anticancer drugs.

dbtl cycle, prediction, recommendation, (15 more...)

arXiv.org Machine Learning

1911.11091

Country:

North America > United States > California > Alameda County > Berkeley (0.14)
North America > United States > Tennessee > Anderson County > Oak Ridge (0.04)
Europe > Spain > Basque Country > Biscay Province > Bilbao (0.04)
Europe > Portugal > Porto > Porto (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Energy > Renewable > Biofuel (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)


Assessing the accuracy of machine-assisted abstract screening with DistillerAI: a user study

#artificialintelligence | Nov-24-2019, 23:05:43 GMT

Web applications that employ natural language processing technologies to support systematic reviewers during abstract screening have become more common. The goal of our project was to conduct a case study to explore a screening approach that temporarily replaces a human screener with a semi-automated screening tool. We evaluated the accuracy of the approach using DistillerAI as a semi-automated screening tool. A published comparative effectiveness review served as the reference standard. Five teams of professional systematic reviewers screened the same 2472 abstracts in parallel.

abstract screening, distillerai, screening, (12 more...)

#artificialintelligence

Genre: Research Report > Experimental Study (0.33)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.40)


Non-autoregressive Transformer by Position Learning

Bao, Yu, Zhou, Hao, Feng, Jiangtao, Wang, Mingxuan, Huang, Shujian, Chen, Jiajun, LI, Lei

arXiv.org Artificial Intelligence | Nov-24-2019

Non-autoregressive models are promising on various text generation tasks. Previous work rarely models the positions of generated words explicitly, yet position modeling is an essential problem in non-autoregressive text generation. In this study, we propose PNAT, which incorporates positions as a latent variable into the text generative process. Experimental results show that PNAT achieves top results on machine translation and paraphrase generation tasks, outperforming several strong baselines.

machine translation, transformer, translation, (15 more...)

arXiv.org Artificial Intelligence

1911.10677

Country:

North America > United States (0.14)
Asia > China > Jiangsu Province > Nanjing (0.04)
Asia > China > Hong Kong (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)


Microsoft adds Māori to translator as New Zealand pushes to revitalize the language – TechCrunch

#artificialintelligence | Nov-22-2019, 20:12:38 GMT

The benefits of machine translation are easy to see and experience for ourselves, but those practical applications are only one part of what makes the technology valuable. Microsoft and the government of New Zealand are demonstrating the potential of translation tech to help preserve and hopefully breathe new life into the Māori language. Te reo Māori, as it is called in full, is of course the language of New Zealand's largest indigenous community. But as is common elsewhere as well, the tongue has fallen into obscurity as generations of Māori have assimilated into the dominant culture of their colonizers. Māori people make up about 15 percent of the population, and only a quarter of them speak the language, making for a grand total of 3 percent that speak te reo Māori.

new zealand, māori, translator, (4 more...)

#artificialintelligence

Country: Oceania > New Zealand > North Island > Waikato (0.06)

Industry: Government (0.94)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.74)


Optimizing Data Usage via Differentiable Rewards

Wang, Xinyi, Pham, Hieu, Michel, Paul, Anastasopoulos, Antonios, Neubig, Graham, Carbonell, Jaime

arXiv.org Machine Learning | Nov-22-2019

To acquire a new skill, humans learn better and faster if a tutor, based on their current knowledge level, informs them of how much attention they should pay to particular content or practice problems. Similarly, a machine learning model could potentially be trained better with a scorer that "adapts" to its current learning state and estimates the importance of each training data instance. Training such an adaptive scorer efficiently is a challenging problem; in order to precisely quantify the effect of a data instance at a given time during the training, it is typically necessary to first complete the entire training process. To efficiently optimize data usage, we propose a reinforcement learning approach called Differentiable Data Selection (DDS). In DDS, we formulate a scorer network as a learnable function of the training data, which can be efficiently updated along with the main model being trained. Specifically, DDS updates the scorer with an intuitive reward signal: it should up-weight the data whose gradient is similar to that of a dev set upon which we would finally like to perform well. Without significant computing overhead, DDS delivers strong and consistent improvements over several strong baselines on two very different tasks: machine translation and image classification.
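The reward signal described in this abstract can be illustrated with a toy sketch. It is not the authors' implementation: the one-parameter regression model, the function names (`grad`, `alignment_rewards`, `update_scorer`), the learning rate, and the data are all assumptions made for illustration. Each training example is rewarded by the dot product of its gradient with the mean dev-set gradient (a plain product here, since the toy model has a single parameter), and a hypothetical scorer raises the logit of examples with positive reward.

```python
# Toy sketch of gradient-alignment rewards; NOT the authors' implementation.
# Assumed model: scalar linear regression y = w * x with squared-error loss.

def grad(w, x, y):
    """Gradient of 0.5 * (w * x - y) ** 2 with respect to w."""
    return (w * x - y) * x

def alignment_rewards(w, train, dev):
    """Reward each training example by gradient alignment with the dev set."""
    dev_grad = sum(grad(w, x, y) for x, y in dev) / len(dev)
    return [grad(w, x, y) * dev_grad for x, y in train]

def update_scorer(logits, rewards, lr=0.1):
    """Hypothetical scorer update: move each logit in the direction of its reward."""
    return [logit + lr * r for logit, r in zip(logits, rewards)]

w = 0.0
dev = [(1.0, 2.0), (2.0, 4.0)]      # clean data with true slope 2
train = [(1.0, 2.0), (1.0, -2.0)]   # the second example has a flipped label
rewards = alignment_rewards(w, train, dev)
logits = update_scorer([0.0, 0.0], rewards)
```

In this toy setting the clean example receives a positive reward and the flipped-label example a negative one, so the scorer would up-weight the former.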

dev, image classification, machine translation, (13 more...)

arXiv.org Machine Learning

1911.10088

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > California > Santa Clara County > Mountain View (0.04)

Genre:

Research Report (1.00)
Instructional Material > Course Syllabus & Notes (0.48)

Industry: Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)


Automatically Neutralizing Subjective Bias in Text

Pryzant, Reid, Martinez, Richard Diehl, Dass, Nathan, Kurohashi, Sadao, Jurafsky, Dan, Yang, Diyi

arXiv.org Artificial Intelligence | Nov-21-2019

Texts like news, encyclopedias, and some social media strive for objectivity. Yet bias in the form of inappropriate subjectivity - introducing attitudes via framing, presupposing truth, and casting doubt - remains ubiquitous. This kind of bias erodes our collective trust and fuels social conflict. To address this issue, we introduce a novel testbed for natural language generation: automatically bringing inappropriately subjective text into a neutral point of view ("neutralizing" biased text). We also offer the first parallel corpus of biased language. The corpus contains 180,000 sentence pairs and originates from Wikipedia edits that removed various framings, presuppositions, and attitudes from biased sentences. Last, we propose two strong encoder-decoder baselines for the task. A straightforward yet opaque CONCURRENT system uses a BERT encoder to identify subjective words as part of the generation process. An interpretable and controllable MODULAR algorithm separates these steps, using (1) a BERT-based classifier to identify problematic words and (2) a novel join embedding through which the classifier can edit the hidden states of the encoder. Large-scale human evaluation across four domains (encyclopedias, news headlines, books, and political speeches) suggests that these algorithms are a first step towards the automatic identification and reduction of bias.

danescu-niculescu-mizil, proceedings, recasen, (16 more...)

arXiv.org Artificial Intelligence

1911.09709

Country:

Africa > Eswatini > Manzini > Manzini (0.04)
Europe > Russia (0.04)
Asia > Russia (0.04)
(2 more...)

Genre:

Research Report > New Finding (0.93)
Research Report > Experimental Study (0.93)

Industry:

Government > Regional Government > North America Government > United States Government (0.46)
Media > News (0.46)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)


Fine-Tuning by Curriculum Learning for Non-Autoregressive Neural Machine Translation

Guo, Junliang, Tan, Xu, Xu, Linli, Qin, Tao, Chen, Enhong, Liu, Tie-Yan

arXiv.org Machine Learning | Nov-21-2019

Non-autoregressive translation (NAT) models remove the dependence on previous target tokens and generate all target tokens in parallel, resulting in significant inference speedup but at the cost of inferior translation accuracy compared to autoregressive translation (AT) models. Considering that AT models have higher accuracy and are easier to train than NAT models, and that both share the same model configurations, a natural idea to improve the accuracy of NAT models is to transfer a well-trained AT model to an NAT model through fine-tuning. However, since AT and NAT models differ greatly in training strategy, straightforward fine-tuning does not work well. In this work, we introduce curriculum learning into fine-tuning for NAT. Specifically, we design a curriculum in the fine-tuning process to progressively switch the training from autoregressive generation to non-autoregressive generation. Experiments on four benchmark translation datasets show that the proposed method achieves good improvement (more than 1 BLEU score) over previous NAT baselines in terms of translation accuracy, and greatly speeds up (more than 10 times) the inference process over AT baselines.
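The idea of progressively switching from autoregressive to non-autoregressive training can be pictured as a pacing schedule. The sketch below is only an assumption about how such a schedule might look: the abstract does not specify the pacing function, so the linear ramp, the per-position mixing, and the names `nat_probability` and `choose_modes` are hypothetical.

```python
import random

def nat_probability(step, total_steps):
    """Assumed linear pacing: share of decoder inputs fed NAT-style at this step."""
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    return min(1.0, max(0.0, step / total_steps))

def choose_modes(num_positions, step, total_steps, rng):
    """Pick, per target position, whether to use the AT-style or NAT-style input."""
    p = nat_probability(step, total_steps)
    return ["NAT" if rng.random() < p else "AT" for _ in range(num_positions)]

rng = random.Random(0)
start = choose_modes(8, step=0, total_steps=1000, rng=rng)     # all "AT"
end = choose_modes(8, step=1000, total_steps=1000, rng=rng)    # all "NAT"
```

Under this schedule, fine-tuning starts with purely autoregressive inputs and ends with purely non-autoregressive ones; intermediate steps mix the two in proportion to training progress.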

curriculum, decoder input, translation, (13 more...)

arXiv.org Machine Learning

1911.08717

Country: Asia > China > Anhui Province (0.04)

Genre: Research Report (0.64)

Industry: Education (0.34)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)


What Do You Mean 'Why?': Resolving Sluices in Conversations

Hansen, Victor Petrén Bach, Søgaard, Anders

arXiv.org Artificial Intelligence | Nov-21-2019

Victor Petrén Bach Hansen (1, 2), Anders Søgaard (1, 3). 1: Department of Computer Science, University of Copenhagen, Denmark; 2: Topdanmark A/S, Denmark; 3: Google Research, Berlin. victor.petren@di.ku.dk, soegaard@di.ku.dk

Abstract: In conversation, we often ask one-word questions such as 'Why?' or 'Who?'. Such questions are typically easy for humans to answer, but can be hard for computers, because their resolution requires retrieving both the right semantic frames and the right arguments from context. This paper introduces the novel ellipsis resolution task of resolving such one-word questions, referred to as sluices in linguistics. We present a crowd-sourced dataset containing annotations of sluices from over 4,000 dialogues collected from conversational QA datasets, as well as a series of strong baseline architectures.

1 Introduction: Stand-alone wh-word questions, such as When? in Figure 1, are easy for us to understand, but in order to interpret them we need to retrieve implicit information from context. Learning to do so is an instance of sluicing, an ellipsis phenomenon, defined by Ross (1969) as 'the effect of deleting everything but the preposed constituent of an embedded question, under the condition that the remainder of the question is identical to some other part of the sentence, or a preceding sentence.' In the context of conversations, one-word wh-word questions are particularly frequent (Anand and Hardt 2016; Rønning, Hardt, and Søgaard 2018), and because they are often hard to resolve, they seem to be a frequent source of error in conversational question answering (Choi et al. 2018; Reddy, Chen, and Manning 2018) and dialogue understanding (Vlachos and Clark 2014). We refer to this type of sluicing as conversational sluicing.

Unlike previous work where sluice resolution is treated as predicting the span of the antecedent (Anand and Hardt 2016; Rønning, Hardt, and Søgaard 2018), we frame conversational sluice resolution as a Natural Language Generation (NLG) task, in which we seek to automatically generate the full question, given a question-answer context and a one-word question. Q1: Where was the bombing?

dataset, resolution, sluice, (16 more...)

arXiv.org Artificial Intelligence

1911.09478

Country:

Europe > Denmark > Capital Region > Copenhagen (0.24)
North America > United States > California > San Diego County > San Diego (0.04)
North America > United States > Ohio > Franklin County > Columbus (0.04)
(5 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.94)


Visualisation of embedding relations (Word2Vec, BERT)

#artificialintelligence | Nov-19-2019, 09:58:24 GMT

In this story, we will visualise the word embedding vectors to understand the relations between words described by the embeddings. This story focuses on word2vec [1] and BERT [2]. To understand the embeddings, I suggest reading a different introduction (like this) as this story does not aim to describe them. This story is part of my journey to develop Neural Machine Translation (NMT) using BERT contextualised embedding vectors. Word embeddings are models to generate computer-friendly numeric vector representations for words.
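As a rough illustration of what "relations between words described by the embeddings" can mean, the sketch below compares toy word vectors by cosine similarity. The three-dimensional vectors are invented for this example and do not come from word2vec or BERT; real embeddings have hundreds of dimensions and would be loaded from a trained model.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Hypothetical 3-dimensional vectors, invented for illustration only.
vectors = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.7, 0.2],
    "apple": [0.1, 0.2, 0.9],
}
sim_royal = cosine(vectors["king"], vectors["queen"])
sim_fruit = cosine(vectors["king"], vectors["apple"])
```

With these made-up values, "king" lies much closer to "queen" than to "apple", which is the kind of relation a visualisation of embedding vectors aims to expose.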

matrix, projection, vector, (15 more...)

#artificialintelligence

Country: North America > Mexico (0.06)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.63)
