AITopics | Machine Translation


Machine Translation

"Machine translation (MT) is the application of computers to the task of translating texts from one natural language to another. One of the very earliest pursuits in computer science, MT has proved to be an elusive goal, but today a number of systems are available which produce output which, if not perfect, is of sufficient quality to be useful in a number of specific domains."
– Definition from the European Association for Machine Translation (EAMT).

You can translate text of your choice by using free translators such as: CAPITA, Google Translate, SDL International, SYSTRAN.


The Impact of Annotation Guidelines and Annotated Data on Extracting App Features from App Reviews

Shah, Faiz Ali, Sirts, Kairit, Pfahl, Dietmar

arXiv.org Machine Learning | Oct-11-2018

Annotation guidelines used to guide the annotation of training and evaluation datasets can have a considerable impact on the quality of machine learning models. In this study, we explore the effects of annotation guidelines on the quality of app feature extraction models. As a main result, we propose several changes to the existing annotation guidelines with the goal of making the extracted app features more useful and informative to app developers. We test the proposed changes by simulating the application of the new annotation guidelines and then evaluating the performance of supervised machine learning models trained on datasets annotated with the initial and the simulated guidelines. While the overall performance of automatic app feature extraction remains the same as that of the model trained on the dataset with initial annotations, the features extracted by the model trained on the dataset with simulated new annotations are less noisy and more informative to app developers. Secondly, we are interested in what kind of annotated training data is necessary for training an automatic app feature extraction model. In particular, we explore whether the training set should contain annotated app reviews from those apps/app categories on which the model is subsequently planned to be applied, or whether it is sufficient to have annotated app reviews from any app available for training, even when these apps are from very different categories compared to the test app. Our experiments show that having annotated training reviews from the test app is not necessary, although including them in the training set helps to improve recall. Furthermore, we test whether augmenting the training set with annotated product reviews helps to improve the performance of app feature extraction. We find that the models trained on the augmented training set achieve improved recall, but at the cost of a drop in precision.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Machine Learning

1810.05187

Country: Europe (0.46)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology > Software (0.87)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (0.46)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.46)
(3 more...)


Exploring the Use of Attention within an Neural Machine Translation Decoder States to Translate Idioms

Salton, Giancarlo D., Ross, Robert J., Kelleher, John D.

arXiv.org Machine Learning | Oct-10-2018

Idioms pose problems to almost all Machine Translation systems. This type of language is very frequent in day-to-day language use and cannot be simply ignored. The recent interest in memory augmented models in the field of Language Modelling has aided the systems to achieve good results by bridging long-distance dependencies. In this paper we explore the use of such techniques into a Neural Machine Translation system to help in translation of idiomatic language.

machine learning, natural language, translation, (18 more...)

arXiv.org Machine Learning

1810.06695

Country:

Europe > United Kingdom (0.46)
North America > United States (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)


End-to-End Content and Plan Selection for Data-to-Text Generation

Gehrmann, Sebastian, Dai, Falcon Z., Elder, Henry, Rush, Alexander M.

arXiv.org Artificial Intelligence | Oct-10-2018

Learning to generate fluent natural language from structured data with neural networks has become a common approach for NLG. This problem can be challenging when the form of the structured data varies between examples. This paper presents a survey of several extensions to sequence-to-sequence models to account for the latent content selection process, particularly variants of copy attention and coverage decoding. We further propose a training method based on diverse ensembling to encourage models to learn distinct sentence templates during training. An empirical evaluation of these techniques shows an increase in the quality of generated text across five automated metrics, as well as human evaluation.

arxiv preprint arxiv, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

1810.047

Country:

Europe (0.46)
North America (0.28)

Genre:

Overview (0.88)
Research Report (0.82)

Industry: Consumer Products & Services > Restaurants (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)


Understanding the Origins of Bias in Word Embeddings

Brunet, Marc-Etienne, Alkalay-Houlihan, Colleen, Anderson, Ashton, Zemel, Richard

arXiv.org Machine Learning | Oct-8-2018

The power of machine learning systems not only promises great technical progress, but risks societal harm. As a recent example, researchers have shown that popular word embedding algorithms exhibit stereotypical biases, such as gender bias. The widespread use of these algorithms in machine learning systems, from automated translation services to curriculum vitae scanners, can amplify stereotypes in important contexts. Although methods have been developed to measure these biases and alter word embeddings to mitigate their biased representations, there is a lack of understanding in how word embedding bias depends on the training data. In this work, we develop a technique for understanding the origins of bias in word embeddings. Given a word embedding trained on a corpus, our method identifies how perturbing the corpus will affect the bias of the resulting embedding. This can be used to trace the origins of word embedding bias back to the original training documents. Using our method, one can investigate trends in the bias of the underlying corpus and identify subsets of documents whose removal would most reduce bias. We demonstrate our techniques on both a New York Times and Wikipedia corpus and find that our influence function-based approximations are extremely accurate.

corpus, machine learning, natural language, (20 more...)

arXiv.org Machine Learning

1810.03611

Country:

Europe (0.46)
North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report > New Finding (0.47)

Industry:

Health & Medicine (0.68)
Media (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.34)


Google Translate for iOS can speak in your local accent

Engadget | Oct-3-2018, 18:50:54 GMT

Until now, using Google Translate on your iPhone has meant listening to the same pronunciation for translations no matter where you live. That's not very considerate, and potentially a problem if you live in countries where foreign accents could make comprehension difficult. You won't have that issue from now on -- an update to Google Translate has added speech output in local versions of multiple languages, including English, Bengali, French and Spanish. You can hear English results with an Indian accent, for instance, or listen to French with a Canadian spin. Android has included these speech options for a while.

artificial intelligence, google translate, natural language, (2 more...)

Engadget

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.95)
Information Technology > Communications > Mobile (0.68)


Optimal Completion Distillation for Sequence Learning

Sabour, Sara, Chan, William, Norouzi, Mohammad

arXiv.org Machine Learning | Oct-2-2018

We present Optimal Completion Distillation (OCD), a training procedure for optimizing sequence to sequence models based on edit distance. OCD is efficient, has no hyper-parameters of its own, and does not require pretraining or joint optimization with conditional log-likelihood. Given a partial sequence generated by the model, we first identify the set of optimal suffixes that minimize the total edit distance, using an efficient dynamic programming algorithm. Then, for each position of the generated sequence, we use a target distribution that puts equal probability on the first token of all the optimal suffixes. OCD achieves the state-of-the-art performance on end-to-end speech recognition, on both Wall Street Journal and Librispeech datasets, achieving $9.3\%$ WER and $4.5\%$ WER respectively.
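The target construction described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names and the `EOS` marker are hypothetical, and it assumes plain Levenshtein distance over tokens. For a generated prefix, it computes the edit distance to every prefix of the reference, keeps the reference positions where that distance is smallest, and spreads probability equally over the tokens that follow those positions.

```python
EOS = "</s>"  # hypothetical end-of-sequence marker, not taken from the paper


def prefix_edit_distances(prefix, target):
    """Return d, where d[j] is the Levenshtein distance between
    the generated prefix and target[:j]."""
    # Empty prefix versus target[:j] costs j insertions.
    row = list(range(len(target) + 1))
    for token in prefix:
        new_row = [row[0] + 1]
        for j, tgt in enumerate(target, start=1):
            new_row.append(min(
                row[j] + 1,                   # drop the generated token
                new_row[j - 1] + 1,           # insert the target token
                row[j - 1] + (token != tgt),  # match or substitute
            ))
        row = new_row
    return row


def optimal_next_tokens(prefix, target):
    """Tokens that start a suffix minimizing the total edit distance."""
    d = prefix_edit_distances(prefix, target)
    best = min(d)
    tokens = set()
    for j, dist in enumerate(d):
        if dist == best:
            # Continuing with target[j:] keeps the total distance at `best`.
            tokens.add(target[j] if j < len(target) else EOS)
    return tokens


def target_distribution(prefix, target):
    """Equal probability on the first token of every optimal suffix."""
    tokens = optimal_next_tokens(prefix, target)
    return {tok: 1.0 / len(tokens) for tok in tokens}
```

For example, with the reference `list("SUNDAY")` and the generated prefix `list("SA")`, the optimal continuations are "UNDAY" and "NDAY" (both give a total edit distance of 1), so the sketch puts probability 0.5 on each of "U" and "N".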

edit distance, prefix, sequence, (14 more...)

arXiv.org Machine Learning

1810.01398

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.69)


AI Translation: Latest Trends - Text United

#artificialintelligence | Oct-1-2018, 15:16:50 GMT

It is not out of reason to boldly say that translation is of great importance to man. The diversity of languages and cultures in the world makes translation essential to humanity. The benefits of translation to humankind spread across businesses, politics, international relations, tourism, and education. Any company can go global. Moreover, the secret of a successful international business lies in quality translation services.

artificial intelligence, natural language, translation, (11 more...)

#artificialintelligence

Industry: Government (0.35)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)


IncSQL: Training Incremental Text-to-SQL Parsers with Non-Deterministic Oracles

Shi, Tianze, Tatwawadi, Kedar, Chakrabarti, Kaushik, Mao, Yi, Polozov, Oleksandr, Chen, Weizhu

arXiv.org Artificial Intelligence | Oct-1-2018

We present a sequence-to-action parsing approach for the natural language to SQL task that incrementally fills the slots of a SQL query with feasible actions from a pre-defined inventory. To account for the fact that typically there are multiple correct SQL queries with the same or very similar semantics, we draw inspiration from syntactic parsing techniques and propose to train our sequence-to-action models with non-deterministic oracles. We evaluate our models on the WikiSQL dataset and achieve an execution accuracy of 83.7% on the test set, a 2.1% absolute improvement over the models trained with traditional static oracles assuming a single correct target SQL query. When further combined with the execution-guided decoding strategy, our model sets a new state-of-the-art performance at an execution accuracy of 87.1%.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

1809.05054

Country: North America (0.28)

Genre: Research Report (0.50)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.74)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.72)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.69)


Is Neural Machine Translation Ready for Marketing Content?

#artificialintelligence | Sep-30-2018, 12:51:55 GMT

Music fans were the first to prove this by making a laughingstock of the app by loading lyrics from songs like Will Smith's "Fresh Prince of Bel-Air" and the theme song from Moana to see what funny or ridiculous translations Google would generate. While the tool isn't nearly as bad as videos make it out to be, this negative PR has kept companies from using it. After all, if Google can't translate song lyrics correctly, why would you trust it with marketing content? But Google Translate doesn't represent all machine translation. However, it is a brand that happens to be well-known and free.

artificial intelligence, machine translation, natural language, (7 more...)

#artificialintelligence

Industry: Marketing (0.40)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)


SwiftKey for Android now offers real-time message translation

Engadget | Sep-27-2018, 21:37:27 GMT

Microsoft has brought its Translator to SwiftKey, allowing users to translate their conversations without having to leave the app they're in. With an update out today, SwiftKey for Android will translate incoming and outgoing messages in real time, and it will be able to do so for over 60 languages. Additionally, while you won't need to install Microsoft Translator to be able to use the new SwiftKey feature, the company says translation will work offline if you do. Microsoft purchased SwiftKey in 2016, and it only makes sense that it would merge its translator with the smart keyboard. Android users can access the feature through SwiftKey's Toolbar -- just tap the plus sign in the upper left corner of the keyboard to get there -- and you can check out which languages are supported here.

artificial intelligence, natural language, real time system, (7 more...)

Engadget

Technology:

Information Technology > Communications > Mobile (0.93)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.67)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.67)
Information Technology > Architecture > Real Time Systems (0.67)
