Machine Translation


Can Transformers Jump Around Right in Natural Language? Assessing Performance Transfer from SCAN

arXiv.org Artificial Intelligence

Despite their practical success, modern seq2seq architectures are unable to generalize systematically on several SCAN tasks. Hence, it is not clear if SCAN-style compositional generalization is useful in realistic NLP tasks. In this work, we study the benefit that such compositionality brings to several machine translation tasks. We present several focused modifications of the Transformer that greatly improve generalization capabilities on SCAN and select one that remains on par with a vanilla Transformer on a standard machine translation (MT) task. Next, we study its performance in low-resource settings and on a newly introduced distribution-shifted English-French translation task. Overall, we find that improvements of a SCAN-capable model do not directly transfer to the resource-rich MT setup. In contrast, in the low-resource setup, general modifications lead to an improvement of up to 13.1% BLEU score w.r.t. a vanilla Transformer. Similarly, an improvement of 14% in an accuracy-based metric is achieved in the introduced compositional English-French translation task. This provides experimental evidence that the compositional generalization assessed in SCAN is particularly useful in resource-starved and domain-shifted scenarios.


KantanStream Meets the Challenge of Big Data and Wins

#artificialintelligence

One of the wonders of the modern I.T. era is the extent to which technology has shrunk this world. Artificial Intelligence (AI) has given industry a global reach that can be traversed in mere nanoseconds. It is a brave new world, a reality that would have been the stuff of science fiction only a few short decades ago. I am of the vintage where my co-workers and I had neither the worldwide web nor email to communicate with, and the cloud was just a fluffy white thing in the sky.


Zoom is buying a startup to bring real-time translation to video calls

Engadget

Zoom announced today it plans to acquire Karlsruhe Information Technology, a German startup that specializes in machine learning-based real-time translation. Also known as Kites, the company is made up of about a dozen researchers with ties to the Karlsruhe Institute of Technology. Zoom didn't share the financial terms of the deal, but did disclose that the startup will help it bring machine translation features to its platform. Moving forward, Zoom says it may also establish a research and development center in Germany. "We are continuously looking for new ways to deliver happiness to our users and improve meeting productivity, and [machine translation] solutions will be key in enhancing our platform for Zoom customers across the globe," said Velchamy Sankarlingam, president of product and engineering at Zoom.


Rethinking the Evaluation of Neural Machine Translation

arXiv.org Artificial Intelligence

The evaluation of neural machine translation systems is usually built upon translations generated by a certain decoding method (e.g., beam search), with evaluation metrics computed over the generated translations (e.g., BLEU). However, this evaluation framework suffers from high search errors brought by heuristic search algorithms and is limited by its nature of evaluation over one best candidate. In this paper, we propose a novel evaluation protocol, which not only avoids the effect of search errors but also provides a system-level evaluation from the perspective of model ranking. In particular, our method is based on our newly proposed exact top-$k$ decoding instead of beam search. Our approach evaluates model errors by the distance between the candidate spaces scored by the references and the model respectively. Extensive experiments on WMT'14 English-German demonstrate that bad ranking ability is connected to the well-known beam search curse, and state-of-the-art Transformer models are facing serious ranking errors. By evaluating various model architectures and techniques, we provide several interesting findings. Finally, to effectively approximate the exact search algorithm with the same time cost as the original beam search, we present a minimum heap augmented beam search algorithm.
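To make the notion of a search error concrete, here is a minimal sketch of ordinary beam search over a toy model. It is not the paper's exact top-$k$ decoding nor its minimum heap augmented variant; the function `beam_search`, the toy distribution `toy_model`, and the `</s>` end marker are all assumptions made for illustration. With a beam of size 1 the search commits to the locally best first token and misses the sequence with the highest overall probability, which a beam of size 2 recovers.

```python
import heapq
import math

EOS = "</s>"  # hypothetical end-of-sequence marker


def toy_model(prefix):
    """Hypothetical next-token log-probabilities for a given prefix."""
    if len(prefix) == 0:
        return {"a": math.log(0.6), "b": math.log(0.4)}
    if prefix == ("a",):
        return {"x": math.log(0.55), "y": math.log(0.45)}
    if prefix == ("b",):
        return {"z": math.log(0.9), "w": math.log(0.1)}
    return {EOS: 0.0}  # every two-token prefix ends with probability 1


def beam_search(step_log_probs, beam_size, max_len):
    """Standard beam search; returns (log-prob, tokens) of the best hypothesis found."""
    beams = [(0.0, ())]  # (cumulative log-prob, token tuple)
    finished = []
    for _ in range(max_len):
        # Expand every live hypothesis by every possible next token.
        candidates = [
            (score + lp, prefix + (token,))
            for score, prefix in beams
            for token, lp in step_log_probs(prefix).items()
        ]
        # Heuristic pruning: keep only the beam_size best expansions.
        top = heapq.nlargest(beam_size, candidates, key=lambda c: c[0])
        beams = []
        for score, seq in top:
            (finished if seq[-1] == EOS else beams).append((score, seq))
        if not beams:
            break
    finished.extend(beams)  # hypotheses cut off at max_len
    return max(finished, key=lambda c: c[0])


greedy = beam_search(toy_model, beam_size=1, max_len=3)
wider = beam_search(toy_model, beam_size=2, max_len=3)
print(greedy[1], round(math.exp(greedy[0]), 2))
print(wider[1], round(math.exp(wider[0]), 2))
```

In this toy setting the narrow beam returns a hypothesis with probability 0.33 while a sequence with probability 0.36 exists, so the reported output reflects the pruning heuristic as much as the model itself.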


Neural Machine Translation for Low-Resource Languages: A Survey

arXiv.org Artificial Intelligence

Neural Machine Translation (NMT) has seen a tremendous spurt of growth in less than ten years, and has already entered a mature phase. While considered the most widely used solution for Machine Translation, its performance on low-resource language pairs still remains sub-optimal compared to the high-resource counterparts, due to the unavailability of large parallel corpora. Therefore, the implementation of NMT techniques for low-resource language pairs has been receiving the spotlight in the recent NMT research arena, thus leading to a substantial amount of research reported on this topic. This paper presents a detailed survey of research advancements in low-resource language NMT (LRL-NMT), along with a quantitative analysis aimed at identifying the most popular solutions. Based on our findings from reviewing previous work, this survey paper provides a set of guidelines to select the possible NMT technique for a given LRL data setting. It also presents a holistic view of the LRL-NMT research landscape and provides a list of recommendations to further enhance the research efforts on LRL-NMT.


How Daniel Wellington's customer service department saved 99% on translation costs with Amazon Translate

#artificialintelligence

This post is co-authored by Lezgin Bakircioglu, Innovation and Security Manager at Daniel Wellington. In their own words, "Daniel Wellington (DW) is a Swedish fashion brand founded in 2011. Since its inception, it has sold over 11 million watches and established itself as one of the fastest-growing and most coveted brands in the industry." In this post, we share how DW saved 99% on translation costs with Amazon Translate and other AWS services. At DW, having the ability to respond to customers in their local language is critical to the customer journey.


10 Best African Language Datasets for Data Science Projects

#artificialintelligence

Africa has over 2000 languages, but these languages are not well-represented in the existing Natural Language Processing ecosystem. One challenge is the lack of useful African language datasets that we can use to solve different social and economic problems. In this article, I have compiled a list of African language datasets from across the web. You can use these datasets in various NLP tasks such as text classification, named entity recognition, machine translation, sentiment analysis, speech recognition, and topic modeling. I've made this collection of datasets public to give you an opportunity to use your skills and help solve different challenges.


Power Law Graph Transformer for Machine Translation and Representation Learning

arXiv.org Artificial Intelligence

We present the Power Law Graph Transformer, a transformer model with well defined deductive and inductive tasks for prediction and representation learning. The deductive task learns the dataset level (global) and instance level (local) graph structures in terms of learnable power law distribution parameters. The inductive task outputs the prediction probabilities using the deductive task output, similar to a transductive model. We trained our model with Turkish-English and Portuguese-English datasets from TED talk transcripts for machine translation and compared the model performance and characteristics to a transformer model with scaled dot product attention trained on the same experimental setup. We report BLEU scores of $17.79$ and $28.33$ on the Turkish-English and Portuguese-English translation tasks with our model, respectively. We also show how a duality between a quantization set and N-dimensional manifold representation can be leveraged to transform between local and global deductive-inductive outputs using successive application of linear and non-linear transformations end-to-end.


Artificial Neural Network is Revolutionizing The Future of the Translation Industry

#artificialintelligence

Did you know that a full-time working translator can translate approximately 520,000 words per year? It would not be wrong to say that the translation industry has existed for centuries and will grow in double digits in the upcoming years. Because digital realms continuously push for more shared and globalized experiences, the current worth of the global translation industry is $56.1 billion, and the figure is expected to increase at a swift pace in upcoming years. The number is projected to surpass $70 billion by the year 2023. It has been more than 10 years since Google Translate launched using phrase-based machine translation algorithms.


Phrase-level Active Learning for Neural Machine Translation

arXiv.org Artificial Intelligence

Neural machine translation (NMT) is sensitive to domain shift. In this paper, we address this problem in an active learning setting where we can spend a given budget on translating in-domain data, and gradually fine-tune a pre-trained out-of-domain NMT model on the newly translated data. Existing active learning methods for NMT usually select sentences based on uncertainty scores, but these methods require costly translation of full sentences even when only one or two key phrases within the sentence are informative. To address this limitation, we re-examine previous work from the phrase-based machine translation (PBMT) era that selected not full sentences, but rather individual phrases. However, while incorporating these phrases into PBMT systems was relatively simple, it is less trivial for NMT systems, which need to be trained on full sequences to capture larger structural properties of sentences unique to the new domain. To overcome these hurdles, we propose to select both full sentences and individual phrases from unlabelled data in the new domain for routing to human translators. In a German-English translation task, our active learning approach achieves consistent improvements over uncertainty-based sentence selection methods, improving up to 1.2 BLEU score over strong active learning baselines.
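The sentence-level baseline that this abstract contrasts against can be sketched as follows. This is only an illustration of uncertainty-based selection under a word budget, not the authors' phrase-level method; the function `select_by_uncertainty`, the word-count cost model, and the example scores are hypothetical.

```python
def select_by_uncertainty(sentences, uncertainty, budget_words):
    """Greedily pick the most uncertain sentences that still fit the budget.

    sentences: list of source sentences (strings).
    uncertainty: callable returning a score per sentence (higher = less certain).
    budget_words: total number of words we can afford to have translated.
    """
    ranked = sorted(sentences, key=uncertainty, reverse=True)
    chosen, spent = [], 0
    for sentence in ranked:
        cost = len(sentence.split())  # assumed cost model: pay per source word
        if spent + cost <= budget_words:
            chosen.append(sentence)
            spent += cost
    return chosen, spent


# Hypothetical uncertainty scores, e.g. from a pre-trained out-of-domain model.
scores = {
    "the patient shows signs of acute myocarditis": 0.9,
    "please take a seat": 0.1,
    "administer the dose twice daily": 0.6,
}
chosen, spent = select_by_uncertainty(list(scores), scores.get, budget_words=12)
print(chosen, spent)
```

Note that the whole seven-word first sentence is charged against the budget even if only a single phrase in it is informative, which is the cost the abstract points to.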