AITopics | Machine Translation


Machine Translation

"Machine translation (MT) is the application of computers to the task of translating texts from one natural language to another. One of the very earliest pursuits in computer science, MT has proved to be an elusive goal, but today a number of systems are available which produce output which, if not perfect, is of sufficient quality to be useful in a number of specific domains."
– Definition from the European Association for Machine Translation (EAMT).

You can translate text of your choice by using free translators such as: CAPITA, Google Translate, SDL International, SYSTRAN.


XL-Editor: Post-editing Sentences with XLNet

Shih, Yong-Siang, Chang, Wei-Cheng, Yang, Yiming

arXiv.org Machine Learning | Oct-19-2019

While neural sequence generation models achieve initial success for many NLP applications, the canonical decoding procedure with left-to-right generation order (i.e., autoregressive) in one-pass cannot reflect the true nature of human revising a sentence to obtain a refined result. In this work, we propose XL-Editor, a novel training framework that enables state-of-the-art generalized autoregressive pretraining methods, XLNet specifically, to revise a given sentence by the variable-length insertion probability. Concretely, XL-Editor can (1) estimate the probability of inserting a variable-length sequence into a specific position of a given sentence; (2) execute post-editing operations such as insertion, deletion, and replacement based on the estimated variable-length insertion probability; (3) complement existing sequence-to-sequence models to refine the generated sequences. Empirically, we first demonstrate better post-editing capabilities of XL-Editor over XLNet on the text insertion and deletion tasks, which validates the effectiveness of our proposed framework. Furthermore, we extend XL-Editor to the unpaired text style transfer task, where transferring the target style onto a given sentence can be naturally viewed as post-editing the sentence into the target style. XL-Editor achieves significant improvement in style transfer accuracy and also maintains coherent semantic of the original sentence, showing the broad applicability of our method.

probability, sequence, xl-editor, (14 more...)

arXiv.org Machine Learning

1910.10479

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Oceania > Samoa (0.04)
Oceania > New Zealand (0.04)
(5 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)


Facebook makes big advances in AI reasoning and machine translation - SiliconANGLE

#artificialintelligence | Oct-17-2019, 14:37:50 GMT

Facebook Inc. is using its @Scale conference today to provide an update on its progress in artificial intelligence research. The social media company is open-sourcing a new "AI reasoning" platform and providing some updates on its research into machine translation. It's part of a broad push to scale up AI workloads, a difficult task given the massive amounts of data needed to train AI models, Srinivas Narayanan, the lead for Facebook's Applied AI Research, said this morning at the conference in San Jose, California. "Facebook wouldn't be where it is today without AI," Narayanan said. "It's deeply integrated into everything we do."

facebook, monolingual data, translation, (16 more...)

#artificialintelligence

Country: North America > United States > California > Santa Clara County > San Jose (0.25)

Industry: Information Technology > Services (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)


A language processing algorithm for predicting tactical solutions to an operational planning problem under uncertainty

Frejinger, Emma, Larsen, Eric

arXiv.org Machine Learning | Oct-17-2019

This paper is devoted to the prediction of solutions to a stochastic discrete optimization problem. Through an application, we illustrate how we can use a state-of-the-art neural machine translation (NMT) algorithm to predict the solutions by defining appropriate vocabularies, syntaxes and constraints. We attend to applications where the predictions need to be computed in very short computing time -- in the order of milliseconds or less. The results show that with minimal adaptations to the model architecture and hyperparameter tuning, the NMT algorithm can produce accurate solutions within the computing time budget. While these predictions are slightly less accurate than approximate stochastic programming solutions (sample average approximation), they can be computed faster and with less variability.

approximator, container, sequence, (17 more...)

arXiv.org Machine Learning

1910.08216

Country:

North America > United States > New York > New York County > New York City (0.14)
North America > Canada > Quebec > Montreal (0.04)
North America > United States > California > Monterey County > Monterey (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)


Overcoming the Rare Word Problem for Low-Resource Language Pairs in Neural Machine Translation

Ngo, Thi-Vinh, Ha, Thanh-Le, Nguyen, Phuong-Thai, Nguyen, Le-Minh

arXiv.org Machine Learning | Oct-17-2019

Among the six challenges of neural machine translation (NMT) coined by (Koehn and Knowles, 2017), rare-word problem is considered the most severe one, especially in translation of low-resource languages. In this paper, we propose three solutions to address the rare words in neural machine translation systems. First, we enhance source context to predict the target words by connecting directly the source embeddings to the output of the attention component in NMT. Second, we propose an algorithm to learn morphology of unknown words for English in supervised way in order to minimize the adverse effect of rare-word problem. Finally, we exploit synonymous relation from the WordNet to overcome out-of-vocabulary (OOV) problem of NMT. We evaluate our approaches on two low-resource language pairs: English-Vietnamese and Japanese-Vietnamese. In our experiments, we have achieved significant improvements of up to roughly 1.0 BLEU points in both language pairs.

machine translation, proceedings, translation, (13 more...)

arXiv.org Machine Learning

1910.03467

Country:

Asia > Vietnam > Thái Nguyên Province > Thái Nguyên (0.05)
North America > Canada (0.04)
Europe > Spain (0.04)
(7 more...)

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
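
The third idea in the abstract above, using synonym relations to handle out-of-vocabulary (OOV) words, can be illustrated with a minimal sketch. This is not the authors' algorithm: the `synonyms` dictionary is a hypothetical stand-in for a WordNet lookup, and the function name, the `<unk>` fallback token, and the first-match policy are all assumptions made for illustration.

```python
def replace_oov(tokens, vocab, synonyms, unk="<unk>"):
    """Replace each out-of-vocabulary token with an in-vocabulary synonym.

    `synonyms` maps a word to a list of candidate synonyms; it is a toy
    stand-in for a WordNet query (an assumption, not the paper's setup).
    Tokens with no in-vocabulary synonym fall back to `unk`.
    """
    result = []
    for token in tokens:
        if token in vocab:
            # Known word: keep it unchanged.
            result.append(token)
            continue
        # Pick the first candidate synonym that the model vocabulary contains.
        candidates = synonyms.get(token, [])
        substitute = next((s for s in candidates if s in vocab), unk)
        result.append(substitute)
    return result


# Hypothetical toy data for demonstration only.
toy_vocab = {"the", "car", "is", "fast"}
toy_synonyms = {"automobile": ["motorcar", "car"], "rapid": ["fast"]}
print(replace_oov(["the", "automobile", "is", "rapid"], toy_vocab, toy_synonyms))
```

Taking the first in-vocabulary match keeps the sketch simple; a real system would more likely rank candidates, for example by corpus frequency or contextual similarity.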


Translation by the numbers: Facebook AI puts words into multidimensional spaces

The Japan Times | Oct-16-2019, 09:20:24 GMT

PARIS – Designers of machine translation tools still mostly rely on dictionaries to make a foreign language understandable. But now there is a new way: numbers. Facebook researchers say rendering words into figures and exploiting mathematical similarities between languages is a promising avenue -- even if a universal communicator as seen in "Star Trek" remains a distant dream. Powerful automatic translation is a big priority for internet giants. Allowing as many people as possible worldwide to communicate is not just an altruistic goal, but also good business.

facebook, multidimensional space, translation, (8 more...)

The Japan Times

Country:

Europe > Spain > Galicia > Madrid (0.06)
Europe > Russia (0.06)
Europe > France (0.06)
(2 more...)

Industry: Information Technology > Services (0.98)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)


Lost in Translation?

#artificialintelligence | Oct-16-2019, 00:41:26 GMT

Fueled by improvements in speech recognition, machine learning, better algorithms, cloud processing, and more powerful computing devices, the quality of machine translations is improving. Learning another language has never been a simple proposition. It can take months of study to absorb the basics and years to become fluent. Of course, there's the added headache that learning a language doesn't help if a person encounters one of the world's other 7,000 or so languages. "There has always been a need for human translators and interpreters," says Andrew Ochoa, CEO of translation technology firm Waverly Labs.

google translate, machine translation, translation, (12 more...)

#artificialintelligence

AI-Alerts: 2019 > 2019-10 > AAAI AI-Alert for Oct 22, 2019 (1.00)

Country:

North America > United States > Oregon > Clackamas County > West Linn (0.05)
North America > United States > Maryland > Prince George's County > College Park (0.05)
Europe > United Kingdom (0.05)
Europe > Germany > Baden-Württemberg > Karlsruhe Region > Karlsruhe (0.05)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)


MLQA: Evaluating Cross-lingual Extractive Question Answering

Lewis, Patrick, Oğuz, Barlas, Rinott, Ruty, Riedel, Sebastian, Schwenk, Holger

arXiv.org Artificial Intelligence | Oct-16-2019

Question answering (QA) models have shown rapid progress enabled by the availability of large, high-quality benchmark datasets. Such annotated datasets are difficult and costly to collect, and rarely exist in languages other than English, making training QA systems in other languages challenging. An alternative to building large monolingual training datasets is to develop cross-lingual systems which can transfer to a target language without requiring training data in that language. In order to develop such systems, it is crucial to invest in high quality multilingual evaluation benchmarks to measure progress. We present MLQA, a multi-way aligned extractive QA evaluation benchmark intended to spur research in this area. MLQA contains QA instances in 7 languages, namely English, Arabic, German, Spanish, Hindi, Vietnamese and Simplified Chinese. It consists of over 12K QA instances in English and 5K in each other language, with each QA instance being parallel between 4 languages on average. MLQA is built using a novel alignment context strategy on Wikipedia articles, and serves as a cross-lingual extension to existing extractive QA datasets. We evaluate current state-of-the-art cross-lingual representations on MLQA, and also provide machine-translation-based baselines. In all cases, transfer results are shown to be significantly behind training-language performance.

machine learning, natural language, question answering, (19 more...)

arXiv.org Artificial Intelligence

1910.07475

Country:

Oceania > Australia > Victoria > Melbourne (0.04)
North America > Puerto Rico > San Juan > San Juan (0.04)
North America > Canada (0.04)
(6 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.87)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.67)


Fully Quantized Transformer for Improved Translation

Prato, Gabriele, Charlaix, Ella, Rezagholizadeh, Mehdi

arXiv.org Machine Learning | Oct-16-2019

Abstract

State-of-the-art neural machine translation methods employ massive amounts of parameters. Drastically reducing computational costs of such methods without affecting performance has been up to this point unsolved. In this work, we propose a quantization strategy tailored to the Transformer (Vaswani et al., 2017) architecture. We evaluate our method on the WMT14 EN-FR and WMT14 EN-DE translation tasks and achieve state-of-the-art quantization results for the Transformer, obtaining no loss in BLEU scores compared to the non-quantized baseline. We further compress the Transformer by showing that, once the model is trained, a good portion of the nodes in the encoder can be removed without causing any loss in BLEU.

1 Introduction

Neural machine translation methods have achieved impressive results lately (Ahmed et al., 2017; Ott et al., 2018; Edunov et al., 2018). Having been proposed only recently (Kalchbrenner & Blunsom, 2013; Sutskever et al., 2014; Cho et al., 2014), many great works have led the field to move forward quickly. Bahdanau et al. (2014) introduced an attention mechanism, allowing the decoder to attend to any hidden state generated by the encoder. Multiple improvements to their approach have been proposed, such as multiplicative attention (Luong et al., 2015) and more recently multi-head self-attention (Vaswani et al., 2017).

arxiv e-print, quantization, transformer, (14 more...)

arXiv.org Machine Learning

1910.10485

Country:

North America > United States > California > Santa Clara County > Stanford (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Asia > Middle East > Jordan (0.04)
(4 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
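
To make the idea of quantization in the entry above concrete, here is a minimal sketch of generic uniform (min-max) quantization of a list of weights. The abstract does not state the bit width, rounding rule, or which tensors are quantized, so the 8-bit default, the min-max range, and the function names below are assumptions; this is not the paper's Transformer-specific strategy.

```python
def quantize(values, num_bits=8):
    """Map floats onto integer codes in [0, 2**num_bits - 1].

    Uses the min and max of `values` as the representable range
    (an assumed, generic scheme; not taken from the paper).
    """
    lo, hi = min(values), max(values)
    levels = 2 ** num_bits - 1
    # Guard against a zero range when all values are identical.
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((v - lo) / scale) for v in values]
    return codes, scale, lo


def dequantize(codes, scale, lo):
    """Recover approximate floats from integer codes."""
    return [c * scale + lo for c in codes]


# Hypothetical weights for demonstration only.
weights = [-1.0, -0.25, 0.0, 0.5, 1.0]
codes, scale, lo = quantize(weights)
restored = dequantize(codes, scale, lo)
print(codes)
print(restored)
```

Each restored value differs from the original by at most half a quantization step (`scale / 2`), which is the error a quantized model has to tolerate at a given bit width.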


Root Mean Square Layer Normalization

Zhang, Biao, Sennrich, Rico

arXiv.org Machine Learning | Oct-16-2019

Layer normalization (LayerNorm) has been successfully applied to various deep neural networks to help stabilize training and boost model convergence because of its capability in handling re-centering and re-scaling of both inputs and weight matrix. However, the computational overhead introduced by LayerNorm makes these improvements expensive and significantly slows the underlying network, e.g. RNN in particular. In this paper, we hypothesize that re-centering invariance in LayerNorm is dispensable and propose root mean square layer normalization, or RMSNorm. RMSNorm regularizes the summed inputs to a neuron in one layer according to root mean square (RMS), giving the model re-scaling invariance property and implicit learning rate adaptation ability. RMSNorm is computationally simpler and thus more efficient than LayerNorm. We also present partial RMSNorm, or pRMSNorm where the RMS is estimated from p% of the summed inputs without breaking the above properties. Extensive experiments on several tasks using diverse network architectures show that RMSNorm achieves comparable performance against LayerNorm but reduces the running time by 7%~64% on different models. Source code is available at https://github.com/bzhangGo/rmsnorm.

layernorm, normalization, rmsnorm, (13 more...)

arXiv.org Machine Learning

1910.07467

Country:

North America > Canada (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Switzerland > Zürich > Zürich (0.04)
Europe > Spain (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
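
The RMSNorm abstract above describes normalizing the summed inputs by their root mean square, with a partial variant (pRMSNorm) that estimates the RMS from a fraction of the inputs. A minimal pure-Python sketch follows. The placement of the small `eps` term, the choice to use the first `k` inputs for the partial estimate, and the function names are assumptions for illustration; the authors' reference code is at the URL given in the abstract.

```python
import math


def rms_norm(x, gain, eps=1e-8):
    """Divide x by its root mean square, then apply a per-element gain.

    Unlike LayerNorm, no mean is subtracted (no re-centering).
    """
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [g * v / rms for v, g in zip(x, gain)]


def partial_rms_norm(x, gain, p=0.25, eps=1e-8):
    """pRMSNorm sketch: estimate the RMS from a fraction p of the inputs.

    Using the first k inputs is an assumption made here for simplicity.
    """
    k = max(1, math.ceil(len(x) * p))
    rms = math.sqrt(sum(v * v for v in x[:k]) / k + eps)
    return [g * v / rms for v, g in zip(x, gain)]


# Hypothetical inputs for demonstration only.
x = [1.0, 2.0, 3.0, 4.0]
gain = [1.0] * len(x)
print(rms_norm(x, gain))
print(rms_norm([10 * v for v in x], gain))  # nearly identical: re-scaling invariance
```

Scaling every input by the same constant scales the RMS by that constant too, so the normalized output is unchanged up to the effect of `eps`. That is the re-scaling invariance the abstract refers to.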


AI could be a force for good – but we're currently heading for a darker future

#artificialintelligence | Oct-15-2019, 02:23:08 GMT

Artificial Intelligence (AI) is already re-configuring the world in conspicuous ways. Data drives our global digital ecosystem, and AI technologies reveal patterns in data. Smartphones, smart homes, and smart cities influence how we live and interact, and AI systems are increasingly involved in recruitment decisions, medical diagnoses, and judicial verdicts. Whether this scenario is utopian or dystopian depends on your perspective. The potential risks of AI are enumerated repeatedly.

ai system, training data, translation system, (5 more...)

#artificialintelligence

Industry: Information Technology (0.35)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.98)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.57)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.35)
