AITopics | Machine Translation

Machine Translation

"Machine translation (MT) is the application of computers to the task of translating texts from one natural language to another. One of the very earliest pursuits in computer science, MT has proved to be an elusive goal, but today a number of systems are available which produce output which, if not perfect, is of sufficient quality to be useful in a number of specific domains."
– Definition from the European Association for Machine Translation (EAMT).

You can translate text of your choice by using free translators such as: CAPITA, Google Translate, SDL International, SYSTRAN.

Machine Learning for Translation: What's the State of the Language Art? - ReadWrite

#artificialintelligence | Nov-4-2019, 08:36:28 GMT

A new batch of Machine Translation tools driven by Artificial Intelligence is already translating tens of millions of messages per day. Proprietary ML translation solutions from Google, Microsoft, and Amazon are in daily use. Facebook takes its own road with open-source approaches. What works best for translating software, documentation, and natural language content? And where is automation driven by neural networks heading? William Mamane, Head of Digital Marketing at Tomedes, a professional language services agency, had been a skeptic of machine translation.

machine translation, neural network, translation, (13 more...)

#artificialintelligence

AI-Alerts: 2019 > 2019-11 > AAAI AI-Alert for Nov 5, 2019 (1.00)

Industry: Information Technology > Services (0.97)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)

Deciphering The Limitations Of Machine Learning Translations

#artificialintelligence | Nov-2-2019, 08:38:53 GMT

Machine learning offers businesses a new way to translate documents, including marketing materials and other literature. However, these AI solutions may not always be the best option. Towards Data Science has discussed this development. The technique is called neural machine translation.

machine learning translation, translation, translation method, (15 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.84)

Machine Learning is Fun Part 5: Language Translation with Deep Learning and the Magic of Sequences

#artificialintelligence | Nov-1-2019, 20:36:01 GMT

So how do we program a computer to translate human language? The simplest approach is to replace every word in a sentence with the translated word in the target language. This is easy to implement because all you need is a dictionary to look up each word's translation. But the results are bad because it ignores grammar and context. So the next thing you might do is start adding language-specific rules to improve the results.
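The word-for-word approach described above can be sketched in a few lines. This is a minimal illustration, not code from the linked article: the dictionary entries, the `ES_TO_EN` name, and the `translate_word_by_word` helper are all hypothetical.

```python
# Toy Spanish-to-English dictionary (hypothetical entries, for illustration only).
ES_TO_EN = {
    "quiero": "I want",
    "ir": "to go",
    "a": "to",
    "la": "the",
    "playa": "beach",
    "más": "more",
    "bonita": "pretty",
}


def translate_word_by_word(sentence: str, dictionary: dict) -> str:
    """Replace each word with its dictionary entry; unknown words pass through."""
    words = sentence.lower().split()
    return " ".join(dictionary.get(word, word) for word in words)


if __name__ == "__main__":
    # The lookup ignores grammar and context, so the word order stays Spanish.
    print(translate_word_by_word("Quiero ir a la playa más bonita", ES_TO_EN))
    # -> I want to go to the beach more pretty
```

The output keeps the source word order ("the beach more pretty" rather than "the prettiest beach"), which is the kind of error that pushes such systems toward language-specific rules.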

neural network, training data, translation, (15 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Generating Justifications for Norm-Related Agent Decisions

Kasenberg, Daniel, Roque, Antonio, Thielstrom, Ravenna, Chita-Tegmark, Meia, Scheutz, Matthias

arXiv.org Artificial Intelligence | Nov-1-2019

We present an approach to generating natural language justifications of decisions derived from norm-based reasoning. Assuming an agent which maximally satisfies a set of rules specified in an object-oriented temporal logic, the user can ask factual questions (about the agent's rules, actions, and the extent to which the agent violated the rules) as well as "why" questions that require the agent comparing actual behavior to counterfactual trajectories with respect to these rules. To produce natural-sounding explanations, we focus on the subproblem of producing natural language clauses from statements in a fragment of temporal logic, and then describe how to embed these clauses into explanatory sentences. We use a human judgment evaluation on a testbed task to compare our approach to variants in terms of intelligibility, mental model and perceived trust.

agent, explanation, predicate, (16 more...)

arXiv.org Artificial Intelligence

1911.00226

Country: Europe > Netherlands (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Object-Oriented Architecture (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (0.48)
Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (0.46)
(2 more...)

Pseudolikelihood Reranking with Masked Language Models

Salazar, Julian, Liang, Davis, Nguyen, Toan Q., Kirchhoff, Katrin

arXiv.org Machine Learning | Oct-31-2019

We rerank with scores from pretrained masked language models like BERT to improve ASR and NMT performance. These log-pseudolikelihood scores (LPLs) can outperform large, autoregressive language models (GPT-2) in out-of-the-box scoring. RoBERTa reduces WER by up to 30% relative on an end-to-end LibriSpeech system and adds up to 1.7 BLEU on state-of-the-art baselines for TED Talks low-resource pairs, with further gains from domain adaptation. In the multilingual setting, a single XLM can be used to rerank translation outputs in multiple languages. The numerical and qualitative properties of LPL scores suggest that LPLs capture sentence fluency better than autoregressive scores. Finally, we finetune BERT to estimate sentence LPLs without masking, enabling scoring in a single, non-recurrent inference pass.
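As a rough illustration of the scoring idea in the abstract above, the sketch below sums the log-probability of each token given all the others, then adds that score to a base model score to pick a hypothesis. Everything here is an assumption for illustration: the count-based `make_toy_scorer` stands in for a real masked language model such as BERT, and the `rerank` helper, its `weight` parameter, and the base scores are hypothetical rather than the authors' implementation.

```python
import math
from collections import Counter


def make_toy_scorer(corpus):
    """Build a stand-in 'masked LM': P(word | left neighbor, right neighbor)
    estimated from counts with add-one smoothing. Not a real masked LM."""
    triples, contexts, vocab = Counter(), Counter(), set()
    for sentence in corpus:
        padded = ["<s>"] + sentence.split() + ["</s>"]
        vocab.update(padded)
        for left, word, right in zip(padded, padded[1:], padded[2:]):
            triples[(left, word, right)] += 1
            contexts[(left, right)] += 1

    def masked_log_prob(tokens, t):
        # Log-probability of tokens[t] with that position treated as masked.
        padded = ["<s>"] + list(tokens) + ["</s>"]
        left, word, right = padded[t], padded[t + 1], padded[t + 2]
        numerator = triples[(left, word, right)] + 1
        denominator = contexts[(left, right)] + len(vocab)
        return math.log(numerator / denominator)

    return masked_log_prob


def pseudo_log_likelihood(tokens, masked_log_prob):
    """Sum over positions of log P(token_t | all other tokens)."""
    return sum(masked_log_prob(tokens, t) for t in range(len(tokens)))


def rerank(hypotheses, base_scores, masked_log_prob, weight=1.0):
    """Return the hypothesis maximizing base score + weight * pseudo-log-likelihood."""
    def combined(pair):
        hypothesis, base = pair
        return base + weight * pseudo_log_likelihood(hypothesis.split(), masked_log_prob)
    return max(zip(hypotheses, base_scores), key=combined)[0]


if __name__ == "__main__":
    scorer = make_toy_scorer(["the cat sat on the mat", "the dog sat on the rug"])
    candidates = ["cat the sat mat the on", "the cat sat on the mat"]
    # Hypothetical base scores: the scrambled candidate starts slightly ahead.
    print(rerank(candidates, [-3.9, -4.0], scorer))
```

With a real masked language model, `masked_log_prob` would mask position `t`, run the model once, and read off the log-probability of the original token, which requires one forward pass per token.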

bert, language model, machine translation, (14 more...)

arXiv.org Machine Learning

1910.14659

Country: North America > United States > California > Santa Clara County > Palo Alto (0.04)

Genre: Research Report (0.64)

Industry: Education (0.55)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.90)

Ordering Matters: Word Ordering Aware Unsupervised NMT

Banerjee, Tamali, Murthy, Rudra V, Bhattacharyya, Pushpak

arXiv.org Machine Learning | Oct-30-2019

Specifically, given an input sentence of length n, the model applies n/2 random swaps between consecutive words and trains the denoising-based U-NMT model (Artetxe, Labaka, and Agirre 2018). Though effective, applying the denoising strategy to every sentence in the training data leads to uncertainty in the model, thereby limiting the benefits of the denoising-based U-NMT model. In this paper, we propose a simple fine-tuning strategy in which we fine-tune the trained denoising-based U-NMT system without the denoising strategy. The input sentences are presented as is, i.e., without any shuffling noise added. We observe significant improvements in translation performance on many language pairs from our fine-tuning strategy. Our analysis reveals that our proposed models lead to an increase in higher n-gram BLEU scores compared to the denoising U-NMT models.

1 Introduction

Unsupervised Neural Machine Translation (U-NMT) systems (Lample et al. 2018; Artetxe, Labaka, and Agirre 2018; 2019; Wu, Wang, and Wang 2019) typically train an encoder-decoder model for the machine translation task using the monolingual data available in the two languages (l1, l2). The model proposed by Artetxe, Labaka, and Agirre (2018) consists of a shared encoder and language-specific decoders.
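The shuffling noise mentioned in the excerpt above (n/2 random swaps between consecutive words) might look roughly like the following. This is a sketch under assumptions: the `swap_noise` name, the uniform choice of swap position, and the use of integer division for n/2 are illustrative and not taken from the paper's code.

```python
import random


def swap_noise(words, rng):
    """Apply len(words) // 2 random swaps between adjacent words.

    Swap positions are drawn uniformly at random (an assumption); the same
    pair may be swapped more than once, which can undo an earlier swap.
    """
    noisy = list(words)
    n = len(noisy)
    for _ in range(n // 2):
        i = rng.randrange(n - 1)  # index of the left word of the pair
        noisy[i], noisy[i + 1] = noisy[i + 1], noisy[i]
    return noisy


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    sentence = "the quick brown fox jumps over the lazy dog".split()
    print(" ".join(swap_noise(sentence, rng)))
```

The fine-tuning stage the authors propose corresponds to skipping this function and feeding the sentence unchanged.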

machine translation, source sentence, translation, (16 more...)

arXiv.org Machine Learning

1911.01212

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
Asia > India (0.04)

Genre: Research Report (0.83)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

MLguru #15: The State of ML Frameworks, Machine Translation, and PyTorch 1.3

#artificialintelligence | Oct-29-2019, 13:30:02 GMT

Are you a Machine Learning pro already? We are hiring for the position of Senior Machine Learning Engineer. Join the team and help us empower international clients like Volkswagen, IKEA or Keller Williams, as well as startups and industry innovators. Visit our job posting to find out how we can help your career.

machine translation, ml framework, pytorch 1, (1 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.40)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.40)

BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension

Lewis, Mike, Liu, Yinhan, Goyal, Naman, Ghazvininejad, Marjan, Mohamed, Abdelrahman, Levy, Omer, Stoyanov, Ves, Zettlemoyer, Luke

arXiv.org Machine Learning | Oct-29-2019

BART is trained by (1) corrupting text with an arbitrary noising function, and (2) learning a model to reconstruct the original text. It uses a standard Transformer-based neural machine translation architecture which, despite its simplicity, can be seen as generalizing BERT (due to the bidirectional encoder), GPT (with the left-to-right decoder), and many other more recent pretraining schemes. We evaluate a number of noising approaches, finding the best performance by both randomly shuffling the order of the original sentences and using a novel in-filling scheme, where spans of text are replaced with a single mask token. BART is particularly effective when fine-tuned for text generation but also works well for comprehension tasks. It matches the performance of RoBERTa with comparable training resources on GLUE and SQuAD, achieves new state-of-the-art results on a range of abstractive dialogue, question answering, and summarization tasks, with gains of up to 6 ROUGE. BART also provides a 1.1 BLEU increase over a back-translation system for machine translation, with only target language pretraining. We also report ablation experiments that replicate other pretraining schemes within the BART framework, to better measure which factors most influence end-task performance.
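The two noising operations singled out in the abstract above, sentence shuffling and span in-filling, can be sketched as follows. This is only an illustration under stated assumptions: the function names, the `<mask>` string, the `max_span` parameter, and the uniform sampling of one span per sentence are hypothetical choices made here, not the sampling scheme used by the authors.

```python
import random

MASK = "<mask>"  # placeholder mask token (hypothetical spelling)


def shuffle_sentences(sentences, rng):
    """Return the sentences of a document in random order."""
    shuffled = list(sentences)
    rng.shuffle(shuffled)
    return shuffled


def infill_span(tokens, start, length):
    """Replace tokens[start:start + length] with a single mask token."""
    return list(tokens[:start]) + [MASK] + list(tokens[start + length:])


def corrupt(document, rng, max_span=3):
    """Shuffle sentence order, then mask one random span in each sentence.

    Span length and position are drawn uniformly (an assumption made for
    this sketch). Empty sentences are passed through unchanged.
    """
    noisy = []
    for sentence in shuffle_sentences(document, rng):
        tokens = sentence.split()
        if tokens:
            length = rng.randint(1, min(max_span, len(tokens)))
            start = rng.randrange(len(tokens) - length + 1)
            tokens = infill_span(tokens, start, length)
        noisy.append(" ".join(tokens))
    return noisy


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    doc = ["the cat sat on the mat", "it was a sunny day", "birds sang outside"]
    for line in corrupt(doc, rng):
        print(line)
```

Because a whole span collapses into one mask token, a model trained to reconstruct the original text also has to work out how many tokens are missing.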

arxiv preprint arxiv, bart, decoder, (14 more...)

arXiv.org Machine Learning

1910.13461

Country:

Asia > Middle East > Syria (0.28)
Asia > Middle East > Republic of Türkiye (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
(4 more...)

Genre: Research Report (1.00)

Industry: Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

The Automated Copywriter: Algorithmic Rephrasing of Health-Related Advertisements to Improve their Performance

Youngmann, Brit, Gilad-Bachrach, Ran, Karmon, Danny, Yom-Tov, Elad

arXiv.org Artificial Intelligence | Oct-27-2019

Search advertising is one of the most commonly used methods of advertising. Past work has shown that search advertising can be employed to improve health by eliciting positive behavioral change. However, writing effective advertisements requires expertise and (possibly expensive) experimentation, both of which may not be available to public health authorities wishing to elicit such behavioral changes, especially when dealing with public health crises such as epidemic outbreaks. Here we develop an algorithm which builds on past advertising data to train a sequence-to-sequence Deep Neural Network which "translates" advertisements into optimized ads that are more likely to be clicked. The network is trained using more than 114 thousand ads shown on Microsoft Advertising. We apply this translator to two health-related domains: Medical Symptoms (MS) and Preventative Healthcare (PH) and measure the improvements in click-through rates (CTR). Our experiments show that the generated ads are predicted to have higher CTR in 81% of MS ads and 76% of PH ads. To understand the differences between the generated ads and the original ones we develop estimators for the affective attributes of the ads. We show that the generated ads contain more calls-to-action and that they reflect higher valence (36% increase) and higher arousal (87%) on a sample of 1000 ads. Finally, we run an advertising campaign where 10 random ads and their rephrased versions from each of the domains are run in parallel. We show an average improvement in CTR of 68% for the generated ads compared to the original ads. Our results demonstrate the ability to automatically optimize advertisements for the health domain. We believe that our work offers health authorities an improved ability to help nudge people towards healthier behaviors while saving the time and cost needed to optimize advertising campaigns.

advertisement, original ad, valence score, (13 more...)

arXiv.org Artificial Intelligence

1910.12274

Country:

North America > United States (0.46)
Asia > Middle East > Israel (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.68)

Industry:

Marketing (1.00)
Information Technology > Services (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
(2 more...)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

Firefox will soon be able to translate web pages live (and no, it won't use Google)

#artificialintelligence | Oct-26-2019, 06:16:37 GMT

Firefox will soon be able to translate web pages into other languages – and will do so without using any third-party cloud-based services such as Google Translate or Bing Translator. Instead, the translation will happen entirely on your own device, which is in keeping with Mozilla's stated aim to let users keep control of their data (in this case, their identity and the content of the web pages they're viewing), and will keep costs down as there's no need for external processing. As ZDNet reports, this will be made possible by a translation library being developed as part of The Bergamot Project, which is dedicated to developing and improving client-side translation using machine learning. The Bergamot Project received a grant of €3 million (about $3.3 million / £2.6 million / AU$4.9 million) from the EU earlier this year to increase the uptake of language technologies in situations where confidentiality is essential. Mozilla has considered adding translation to Firefox before, but scrapped the idea due to the costs involved.

firefox, translation, use google, (2 more...)

#artificialintelligence

Technology:

Information Technology > Communications > Web (0.89)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.83)
