AITopics | Machine Translation

Machine Translation

"Machine translation (MT) is the application of computers to the task of translating texts from one natural language to another. One of the very earliest pursuits in computer science, MT has proved to be an elusive goal, but today a number of systems are available which produce output which, if not perfect, is of sufficient quality to be useful in a number of specific domains."
– Definition from the European Association for Machine Translation (EAMT).

You can translate text of your choice by using free translators such as: CAPITA, Google Translate, SDL International, SYSTRAN.

How can AI Automate End-to-End Data Science?

Aggarwal, Charu, Bouneffouf, Djallel, Samulowitz, Horst, Buesser, Beat, Hoang, Thanh, Khurana, Udayan, Liu, Sijia, Pedapati, Tejaswini, Ram, Parikshit, Rawat, Ambrish, Wistuba, Martin, Gray, Alexander

arXiv.org Artificial Intelligence Oct-22-2019

Data science is labor-intensive and human experts are scarce but heavily involved in every aspect of it. This makes data science time-consuming and restricted to experts, with the resulting quality heavily dependent on their experience and skills. To make data science more accessible and scalable, we need its democratization. Automated Data Science (AutoDS) is aimed towards that goal and is emerging as an important research and business topic. We introduce and define the AutoDS challenge, followed by a proposal of a general AutoDS framework that covers existing approaches but also provides guidance for the development of new methods. We categorize and review the existing literature from multiple aspects of the problem setup and employed techniques. Then we provide several views on how AI could succeed in automating end-to-end AutoDS. We hope this survey can serve as an insightful guideline for the AutoDS field and provide inspiration for future research.

architecture, data science, learning, (15 more...)

1910.14436

Country:

North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
Asia > Macao (0.04)
Asia > China (0.04)
(5 more...)

Genre:

Research Report (0.40)
Overview (0.34)

Industry:

Leisure & Entertainment > Games (0.68)
Health & Medicine (0.68)

Technology:

Information Technology > Data Science > Data Integration (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.96)
(3 more...)

AI could be a force for positive social change – but we're currently heading for a darker future

#artificialintelligence Oct-20-2019, 11:03:32 GMT

Artificial intelligence (AI) is already re-configuring the world in conspicuous ways. Data drives our global digital ecosystem, and AI technologies reveal patterns in data. Smartphones, smart homes, and smart cities influence how we live and interact, and AI systems are increasingly involved in recruitment decisions, medical diagnoses, and judicial verdicts. Whether this scenario is utopian or dystopian depends on your perspective. The potential risks of AI are enumerated repeatedly.

ai system, training data, translation system, (5 more...)

Industry: Information Technology (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.98)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.57)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.35)

Byte-Pair Encoding for Text-to-SQL Generation

Müller, Samuel, Vlachos, Andreas

arXiv.org Machine Learning Oct-20-2019

Neural sequence-to-sequence models provide a competitive approach to the task of mapping a question in natural language to an SQL query, also referred to as text-to-SQL generation. The Byte-Pair Encoding algorithm (BPE) has previously been used to improve machine translation (MT) between natural languages. In this work, we adapt BPE for text-to-SQL generation. As the datasets for this task are rather small compared to MT, we present a novel stopping criterion that prevents overfitting the BPE encoding to the training set. Additionally, we present AST BPE, which is a version of BPE that uses the Abstract Syntax Tree (AST) of the SQL statement to guide BPE merges and therefore produce BPE encodings that generalize better. We improved the accuracy of a strong attentive seq2seq baseline on five out of six English text-to-SQL tasks while reducing training time by more than 50% on four of them due to the shortened targets. Finally, on two of these tasks we exceeded previously reported accuracies.
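
For readers unfamiliar with BPE, the following is a minimal sketch of the standard character-level merge-learning loop, not the method of the paper above. The function names and the `min_count` cutoff are hypothetical: the cutoff is only a simple stand-in for a stopping rule and is not the stopping criterion the abstract refers to, and no AST guidance is modeled.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe(words, num_merges, min_count=1):
    """Learn up to `num_merges` merges from a list of words (plain BPE).

    `min_count` is a hypothetical cutoff: learning stops once the most
    frequent pair occurs fewer than `min_count` times.
    """
    # Each distinct word is a tuple of symbols, weighted by its frequency.
    vocab = Counter(tuple(w) for w in words)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs across the whole vocabulary.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        # Most frequent pair; ties are broken by the larger pair for determinism.
        best, count = max(pairs.items(), key=lambda kv: (kv[1], kv[0]))
        if count < min_count:
            break
        merges.append(best)
        # Apply the new merge to every word in the vocabulary.
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[merge_pair(symbols, best)] += freq
        vocab = new_vocab
    return merges


def encode(word, merges):
    """Segment `word` by applying the learned merges in the order they were learned."""
    symbols = tuple(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


merges = learn_bpe(["low", "low", "lower"], num_merges=10, min_count=2)
segments = encode("lowest", merges)
```

Each merge shortens the encoded sequences, which is the effect the abstract credits for reduced training time. Stopping early keeps rare, training-set-specific pairs from being merged.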

bpe, dataset, query, (15 more...)

1910.08962

Country:

North America > United States > Alabama (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Switzerland > Geneva > Geneva (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

XL-Editor: Post-editing Sentences with XLNet

Shih, Yong-Siang, Chang, Wei-Cheng, Yang, Yiming

arXiv.org Machine Learning Oct-19-2019

While neural sequence generation models achieve initial success for many NLP applications, the canonical decoding procedure with left-to-right generation order (i.e., autoregressive) in one-pass can not reflect the true nature of human revising a sentence to obtain a refined result. In this work, we propose XL-Editor, a novel training framework that enables state-of-the-art generalized autoregressive pretraining methods, XLNet specifically, to revise a given sentence by the variable-length insertion probability. Concretely, XL-Editor can (1) estimate the probability of inserting a variable-length sequence into a specific position of a given sentence; (2) execute post-editing operations such as insertion, deletion, and replacement based on the estimated variable-length insertion probability; (3) complement existing sequence-to-sequence models to refine the generated sequences. Empirically, we first demonstrate better post-editing capabilities of XL-Editor over XLNet on the text insertion and deletion tasks, which validates the effectiveness of our proposed framework. Furthermore, we extend XL-Editor to the unpaired text style transfer task, where transferring the target style onto a given sentence can be naturally viewed as post-editing the sentence into the target style. XL-Editor achieves significant improvement in style transfer accuracy and also maintains coherent semantic of the original sentence, showing the broad applicability of our method.

probability, sequence, xl-editor, (14 more...)

1910.10479

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Oceania > Samoa (0.04)
Oceania > New Zealand (0.04)
(5 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Facebook makes big advances in AI reasoning and machine translation - SiliconANGLE

#artificialintelligence Oct-17-2019, 14:37:50 GMT

Facebook Inc. is using its @Scale conference today to provide an update on its progress in artificial intelligence research. The social media company is open-sourcing a new "AI reasoning" platform and providing some updates on its research into machine translation. It's part of a broad push to scale up AI workloads, a difficult task given the massive amounts of data needed to train AI models, Srinivas Narayanan (pictured), the lead for Facebook's Applied AI Research, said this morning at the conference in San Jose, California. "Facebook wouldn't be where it is today without AI," Narayanan said. "It's deeply integrated into everything we do."

facebook, monolingual data, translation, (16 more...)

Country: North America > United States > California > Santa Clara County > San Jose (0.25)

Industry: Information Technology > Services (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

A language processing algorithm for predicting tactical solutions to an operational planning problem under uncertainty

Frejinger, Emma, Larsen, Eric

arXiv.org Machine Learning Oct-17-2019

This paper is devoted to the prediction of solutions to a stochastic discrete optimization problem. Through an application, we illustrate how we can use a state-of-the-art neural machine translation (NMT) algorithm to predict the solutions by defining appropriate vocabularies, syntaxes and constraints. We attend to applications where the predictions need to be computed in very short computing time -- in the order of milliseconds or less. The results show that with minimal adaptations to the model architecture and hyperparameter tuning, the NMT algorithm can produce accurate solutions within the computing time budget. While these predictions are slightly less accurate than approximate stochastic programming solutions (sample average approximation), they can be computed faster and with less variability.

approximator, container, sequence, (17 more...)

1910.08216

Country:

North America > United States > New York > New York County > New York City (0.14)
North America > Canada > Quebec > Montreal (0.04)
North America > United States > California > Monterey County > Monterey (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Overcoming the Rare Word Problem for Low-Resource Language Pairs in Neural Machine Translation

Ngo, Thi-Vinh, Ha, Thanh-Le, Nguyen, Phuong-Thai, Nguyen, Le-Minh

arXiv.org Machine Learning Oct-17-2019

Among the six challenges of neural machine translation (NMT) coined by (Koehn and Knowles, 2017), rare-word problem is considered the most severe one, especially in translation of low-resource languages. In this paper, we propose three solutions to address the rare words in neural machine translation systems. First, we enhance source context to predict the target words by connecting directly the source embeddings to the output of the attention component in NMT. Second, we propose an algorithm to learn morphology of unknown words for English in supervised way in order to minimize the adverse effect of rare-word problem. Finally, we exploit synonymous relation from the WordNet to overcome out-of-vocabulary (OOV) problem of NMT. We evaluate our approaches on two low-resource language pairs: English-Vietnamese and Japanese-Vietnamese. In our experiments, we have achieved significant improvements of up to roughly 1.0 BLEU points in both language pairs.

machine translation, proceedings, translation, (13 more...)

1910.03467

Country:

Asia > Vietnam > Thái Nguyên Province > Thái Nguyên (0.05)
North America > Canada (0.04)
Europe > Spain (0.04)
(7 more...)

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Translation by the numbers: Facebook AI puts words into multidimensional spaces

The Japan Times Oct-16-2019, 09:20:24 GMT

PARIS – Designers of machine translation tools still mostly rely on dictionaries to make a foreign language understandable. But now there is a new way: numbers. Facebook researchers say rendering words into figures and exploiting mathematical similarities between languages is a promising avenue -- even if a universal communicator as seen in "Star Trek" remains a distant dream. Powerful automatic translation is a big priority for internet giants. Allowing as many people as possible worldwide to communicate is not just an altruistic goal, but also good business.

facebook, multidimensional space, translation, (8 more...)

Country:

Europe > Spain > Galicia > Madrid (0.06)
Europe > Russia (0.06)
Europe > France (0.06)
(2 more...)

Industry: Information Technology > Services (0.98)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Lost in Translation?

#artificialintelligence Oct-16-2019, 00:41:26 GMT

Fueled by improvements in speech recognition, machine learning, better algorithms, cloud processing, and more powerful computing devices, the quality of machine translations is improving. Learning another language has never been a simple proposition. It can take months of study to absorb the basics and years to become fluent. Of course, there's the added headache that learning a language doesn't help if a person encounters one of the world's other 7,000 or so languages. "There has always been a need for human translators and interpreters," says Andrew Ochoa, CEO of translation technology firm Waverly Labs.

google translate, machine translation, translation, (12 more...)

AI-Alerts: 2019 > 2019-10 > AAAI AI-Alert for Oct 22, 2019 (1.00)

Country:

North America > United States > Oregon > Clackamas County > West Linn (0.05)
North America > United States > Maryland > Prince George's County > College Park (0.05)
Europe > United Kingdom (0.05)
Europe > Germany > Baden-Württemberg > Karlsruhe Region > Karlsruhe (0.05)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

MLQA: Evaluating Cross-lingual Extractive Question Answering

Lewis, Patrick, Oğuz, Barlas, Rinott, Ruty, Riedel, Sebastian, Schwenk, Holger

arXiv.org Artificial Intelligence Oct-16-2019

Question answering (QA) models have shown rapid progress enabled by the availability of large, high-quality benchmark datasets. Such annotated datasets are difficult and costly to collect, and rarely exist in languages other than English, making training QA systems in other languages challenging. An alternative to building large monolingual training datasets is to develop cross-lingual systems which can transfer to a target language without requiring training data in that language. In order to develop such systems, it is crucial to invest in high quality multilingual evaluation benchmarks to measure progress. We present MLQA, a multi-way aligned extractive QA evaluation benchmark intended to spur research in this area. MLQA contains QA instances in 7 languages, namely English, Arabic, German, Spanish, Hindi, Vietnamese and Simplified Chinese. It consists of over 12K QA instances in English and 5K in each other language, with each QA instance being parallel between 4 languages on average. MLQA is built using a novel alignment context strategy on Wikipedia articles, and serves as a cross-lingual extension to existing extractive QA datasets. We evaluate current state-of-the-art cross-lingual representations on MLQA, and also provide machine-translation-based baselines. In all cases, transfer results are shown to be significantly behind training-language performance.

machine learning, natural language, question answering, (19 more...)

1910.07475

Country:

Oceania > Australia > Victoria > Melbourne (0.04)
North America > Puerto Rico > San Juan > San Juan (0.04)
North America > Canada (0.04)
(6 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.87)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.67)
