 Machine Translation


MLPerf Inference Benchmark

arXiv.org Machine Learning

Machine-learning (ML) hardware and software system demand is burgeoning. Driven by ML applications, the number of different ML inference systems has exploded. Over 100 organizations are building ML inference chips, and the systems that incorporate existing models span at least three orders of magnitude in power consumption and four orders of magnitude in performance; they range from embedded devices to data-center solutions. Fueling the hardware are a dozen or more software frameworks and libraries. The myriad combinations of ML hardware and ML software make assessing ML-system performance in an architecture-neutral, representative, and reproducible manner challenging. There is a clear need for industry-wide standard ML benchmarking and evaluation criteria. MLPerf Inference answers that call. Driven by more than 30 organizations as well as more than 200 ML engineers and practitioners, MLPerf implements a set of rules and practices to ensure comparability across systems with wildly differing architectures. In this paper, we present the method and design principles of the initial MLPerf Inference release. The first call for submissions garnered more than 600 inference-performance measurements from 14 organizations, representing over 30 systems that show a range of capabilities.


Training Neural Machine Translation (NMT) Models using Tensor Train Decomposition on TensorFlow (T3F)

arXiv.org Machine Learning

Neural Machine Translation (NMT) is a deep learning model that provides a robust method for machine translation using recurrent neural networks (RNNs). Originally proposed in [1], NMT relies primarily on an encoder-decoder architecture that provides increased fluency over phrase-based systems. This was implemented successfully in [2] for fast, accurate use on very large datasets. However, it has been suggested that there is significant redundancy in the current method of neural network parametrization [3], presenting the opportunity for significant speedup. Tensor Train (TT) decomposition [4] is a method by which large tensors can be approximated by the product of a 'train' of smaller matrices (see Section 2.2). TT decomposition has been proposed as a method of speeding up and reducing the memory usage of machine translation systems with dense weight matrices by reducing the number of parameters required to describe the system [3].
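
To make the 'train' idea concrete, the following is a minimal sketch of one common way to compute a TT decomposition, the TT-SVD procedure based on successive truncated SVDs. It uses plain NumPy rather than T3F, and the function names `tt_svd` and `tt_reconstruct` and the `max_rank` parameter are illustrative assumptions, not the paper's code.

```python
import numpy as np


def tt_svd(tensor, max_rank):
    """Split `tensor` into a list of 3-D TT cores via successive truncated SVDs.

    Each core has shape (left_rank, mode_size, right_rank). `max_rank` is a
    hypothetical cap on every TT rank; smaller values mean more compression.
    """
    shape = tensor.shape
    cores = []
    rank = 1
    remainder = tensor
    for mode_size in shape[:-1]:
        # Unfold so that rows index (left rank, current mode).
        remainder = remainder.reshape(rank * mode_size, -1)
        u, s, vt = np.linalg.svd(remainder, full_matrices=False)
        new_rank = min(max_rank, len(s))
        cores.append(u[:, :new_rank].reshape(rank, mode_size, new_rank))
        # Carry the singular values forward into the remaining factor.
        remainder = s[:new_rank, None] * vt[:new_rank]
        rank = new_rank
    cores.append(remainder.reshape(rank, shape[-1], 1))
    return cores


def tt_reconstruct(cores):
    """Multiply the train of cores back into a full tensor."""
    result = cores[0]
    for core in cores[1:]:
        result = np.tensordot(result, core, axes=([-1], [0]))
    # Drop the dummy boundary ranks of size 1.
    return result.squeeze(axis=(0, -1))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    full = rng.standard_normal((3, 4, 5))
    cores = tt_svd(full, max_rank=20)
    print([core.shape for core in cores])
    print(np.allclose(tt_reconstruct(cores), full))
```

With a generous `max_rank` the reconstruction is exact up to floating-point error; lowering `max_rank` trades accuracy for fewer stored parameters, which is the source of the memory savings the abstract refers to.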


Machine Learning for Translation: What's the State of the Language Art? - ReadWrite

#artificialintelligence

A new batch of Machine Translation tools driven by Artificial Intelligence is already translating tens of millions of messages per day. Proprietary ML translation solutions from Google, Microsoft, and Amazon are in daily use. Facebook takes its own road with open-source approaches. What works best for translating software, documentation, and natural language content? And where is the automation of AI-driven neural networks heading? William Mamane, Head of Digital Marketing at Tomedes, a professional language services agency, had been a skeptic of machine translation.


Deciphering The Limitations Of Machine Learning Translations

#artificialintelligence

Machine learning is offering businesses a new opportunity to translate documents. They can use machine learning to translate marketing materials and other literature. However, these AI solutions may not always be the best option. Towards Data Science has discussed this development. The technique is called neural machine translation.


Machine Learning is Fun Part 5: Language Translation with Deep Learning and the Magic of Sequences

#artificialintelligence

So how do we program a computer to translate human language? The simplest approach is to replace every word in a sentence with the translated word in the target language. This is easy to implement because all you need is a dictionary to look up each word's translation. But the results are bad because it ignores grammar and context. So the next thing you might do is start adding language-specific rules to improve the results.
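
The word-for-word approach described above can be sketched in a few lines. The dictionary entries and the example sentence below are illustrative assumptions, and the function name `translate_word_by_word` is hypothetical.

```python
# Hypothetical toy dictionary; the entries are for illustration only.
SPANISH_TO_ENGLISH = {
    "quiero": "want",
    "ir": "go",
    "a": "to",
    "la": "the",
    "playa": "beach",
    "más": "more",
    "bonita": "pretty",
}


def translate_word_by_word(sentence, dictionary):
    """Replace each word with its dictionary entry; keep unknown words as-is."""
    words = sentence.lower().split()
    return " ".join(dictionary.get(word, word) for word in words)


if __name__ == "__main__":
    print(translate_word_by_word("Quiero ir a la playa más bonita", SPANISH_TO_ENGLISH))
    # prints "want go to the beach more pretty"
```

The output keeps the source word order and drops the implied subject, which shows why ignoring grammar and context gives poor results.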


Generating Justifications for Norm-Related Agent Decisions

arXiv.org Artificial Intelligence

We present an approach to generating natural language justifications of decisions derived from norm-based reasoning. Assuming an agent which maximally satisfies a set of rules specified in an object-oriented temporal logic, the user can ask factual questions (about the agent's rules, actions, and the extent to which the agent violated the rules) as well as "why" questions that require the agent to compare actual behavior to counterfactual trajectories with respect to these rules. To produce natural-sounding explanations, we focus on the subproblem of producing natural language clauses from statements in a fragment of temporal logic, and then describe how to embed these clauses into explanatory sentences. We use a human judgment evaluation on a testbed task to compare our approach to variants in terms of intelligibility, mental model and perceived trust.


Pseudolikelihood Reranking with Masked Language Models

arXiv.org Machine Learning

We rerank with scores from pretrained masked language models like BERT to improve ASR and NMT performance. These log-pseudolikelihood scores (LPLs) can outperform large, autoregressive language models (GPT-2) in out-of-the-box scoring. RoBERTa reduces WER by up to 30% relative on an end-to-end LibriSpeech system and adds up to 1.7 BLEU on state-of-the-art baselines for TED Talks low-resource pairs, with further gains from domain adaptation. In the multilingual setting, a single XLM can be used to rerank translation outputs in multiple languages. The numerical and qualitative properties of LPL scores suggest that LPLs capture sentence fluency better than autoregressive scores. Finally, we finetune BERT to estimate sentence LPLs without masking, enabling scoring in a single, non-recurrent inference pass.
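
As a rough illustration of how a log-pseudolikelihood score is formed, the sketch below masks each token in turn and sums the log-probability that a masked language model assigns to the original token. The `masked_log_prob` callable stands in for a real model such as BERT and is an assumption, as are the function names and the uniform toy scorer used for the demonstration.

```python
import math


def log_pseudolikelihood(tokens, masked_log_prob):
    """Sum log P(token_i | all other tokens) over every position i.

    `masked_log_prob(masked_tokens, position, target)` is a hypothetical
    callable wrapping a masked language model; it returns the log-probability
    of `target` at `position` given the masked sentence.
    """
    total = 0.0
    for i, token in enumerate(tokens):
        masked = tokens[:i] + ["[MASK]"] + tokens[i + 1:]
        total += masked_log_prob(masked, i, token)
    return total


def uniform_log_prob(masked_tokens, position, target, vocab_size=4):
    """Toy stand-in model: every token in a tiny vocabulary is equally likely."""
    return -math.log(vocab_size)


def rerank(hypotheses, masked_log_prob):
    """Return the hypotheses sorted from highest to lowest LPL score."""
    return sorted(
        hypotheses,
        key=lambda tokens: log_pseudolikelihood(tokens, masked_log_prob),
        reverse=True,
    )


if __name__ == "__main__":
    candidates = [["the", "cat", "sat"], ["the", "cat", "sat", "down"]]
    for tokens in rerank(candidates, uniform_log_prob):
        print(tokens, log_pseudolikelihood(tokens, uniform_log_prob))
```

This version ranks on the LPL score alone for simplicity. A practical reranker would typically combine it with the base system's own score, and scoring a sentence this way needs one model call per token, which is the cost the single-pass variant mentioned in the abstract avoids.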


Ordering Matters: Word Ordering Aware Unsupervised NMT

arXiv.org Machine Learning

Specifically, given an input sentence of length n, the model applies n/2 random swaps between consecutive words and trains the denoising-based U-NMT model (Artetxe, Labaka, and Agirre 2018). Though effective, applying the denoising strategy to every sentence in the training data leads to uncertainty in the model, thereby limiting the benefits of the denoising-based U-NMT model. In this paper, we propose a simple fine-tuning strategy where we fine-tune the trained denoising-based U-NMT system without the denoising strategy. The input sentences are presented as is, i.e., without any shuffling noise added. We observe significant improvements in translation performance on many language pairs from our fine-tuning strategy. Our analysis reveals that our proposed models lead to an increase in higher n-gram BLEU scores compared to the denoising U-NMT models.

1 Introduction

Unsupervised Neural Machine Translation (U-NMT) systems (Lample et al. 2018; Artetxe, Labaka, and Agirre 2018; 2019; Wu, Wang, and Wang 2019) typically train an encoder-decoder model for the machine translation task using the monolingual data available in the two languages (l1, l2). The model proposed by Artetxe, Labaka, and Agirre (2018) consists of a shared encoder and language-specific decoders.
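
The swap noise used by the denoising objective can be sketched as follows. This is a minimal illustration, assuming that n/2 is rounded down and that each swap position is drawn uniformly and independently; the function name `add_swap_noise` is hypothetical and not taken from the paper.

```python
import random


def add_swap_noise(words, rng):
    """Return a copy of `words` after len(words) // 2 random adjacent swaps."""
    noisy = list(words)
    if len(noisy) < 2:
        return noisy  # nothing to swap in an empty or one-word sentence
    for _ in range(len(noisy) // 2):
        i = rng.randrange(len(noisy) - 1)  # pick a pair (i, i + 1)
        noisy[i], noisy[i + 1] = noisy[i + 1], noisy[i]
    return noisy


if __name__ == "__main__":
    rng = random.Random(0)
    sentence = "the quick brown fox jumps over the lazy dog".split()
    print(" ".join(add_swap_noise(sentence, rng)))
```

During denoising training the model receives the shuffled sentence and is asked to reproduce the original; the fine-tuning stage described above simply skips this function and feeds the sentence unchanged.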


MLguru #15: The State of ML Frameworks, Machine Translation, and PyTorch 1.3

#artificialintelligence

Are you a Machine Learning pro already? We are hiring for the position of Senior Machine Learning Engineer. Join the team and help us empower international clients like Volkswagen, IKEA or Keller Williams, as well as startups and industry innovators. Visit our job posting to find out how we can help your career.