 Machine Translation


AI Technologies that are Reshaping Social Infrastructure

#artificialintelligence

Together with the rise of the Internet, access to large repositories of data has helped machine learning technology grow exponentially. The incredibly quick pace of growth was unprecedented. As a result, it is obvious that AI will make a significant impact on the world in the years to come. However, with the numerous established and emerging fields of AI around today, such a blanket statement doesn't provide much concrete meaning. What fields and applications of AI are receiving the most investment and development?


How AI is dominating smartphones and home devices

#artificialintelligence

Google's I/O 2018 asserts one thing: the next wave of smartphones will run on a generous amount of Artificial Intelligence. Conversations at the recent Mobile World Congress (MWC) also largely revolved around Artificial Intelligence. Major smartphone makers, led by Apple, Google, Samsung, and many others, are creating operating systems, mobile apps and even smartphones that have Artificial Intelligence at their core. McKinsey Global Institute estimates the investments in Artificial Intelligence R&D made by tech giants such as Google and Baidu to be in the range of $20 billion to $30 billion. In fact, AI is ranked among the 5 disruptive technologies that are shaping our future digital landscape.


Visualizing A Neural Machine Translation Model (Mechanics of Seq2seq Models With Attention)

#artificialintelligence

Sequence-to-sequence models are deep learning models that have achieved a lot of success in tasks like machine translation, text summarization, and image captioning. Google Translate started using such a model in production in late 2016. These models are explained in the two pioneering papers (Sutskever et al., 2014, Cho et al., 2014).


A Comprehensive Survey of Multilingual Neural Machine Translation

arXiv.org Artificial Intelligence

We present a survey on multilingual neural machine translation (MNMT), which has gained a lot of traction in recent years. MNMT has been useful in improving translation quality as a result of translation knowledge transfer (transfer learning). MNMT is more promising and interesting than its statistical machine translation counterpart because end-to-end modeling and distributed representations open new avenues for research on machine translation. Many approaches have been proposed in order to exploit multilingual parallel corpora for improving translation quality. However, the lack of a comprehensive survey makes it difficult to determine which approaches are promising and hence deserve further exploration. In this paper, we present an in-depth survey of existing literature on MNMT. We first categorize various approaches based on their central use-case and then further categorize them based on resource scenarios, underlying modeling principles, core issues and challenges. Wherever possible we address the strengths and weaknesses of several techniques by comparing them with each other. We also discuss the future directions that MNMT research might take. This paper is aimed at both beginners and experts in NMT. We hope this paper will serve as a starting point as well as a source of new ideas for researchers and engineers interested in MNMT.


Learning Accurate Integer Transformer Machine-Translation Models

arXiv.org Machine Learning

We describe a method for training accurate Transformer machine-translation models to run inference using 8-bit integer (INT8) hardware matrix multipliers, as opposed to the more costly single-precision floating-point (FP32) hardware. Unlike previous work, which converted only 85 Transformer matrix multiplications to INT8, leaving 48 out of 133 of them in FP32 because of unacceptable accuracy loss, we convert them all to INT8 without compromising accuracy. Tested on the newstest2014 English-to-German translation task, our INT8 Transformer Base and Transformer Big models yield BLEU scores that are 99.3% to 100% relative to those of the corresponding FP32 models. Our approach converts all matrix-multiplication tensors from an existing FP32 model into INT8 tensors by automatically making range-precision tradeoffs during training. To demonstrate the robustness of this approach, we also include results from INT6 Transformer models.

1 Introduction

We report a method for training accurate yet compact Transformer machine-translation models [Vaswani et al., 2017]. Specifically, we aim these models at hardware with 8-bit integer (INT8) matrix multipliers. Compared to single-precision floating-point (FP32) matrix multiplications, INT8 matrix multiplications not only reduce both storage and bandwidth four times, but they also consume 15 times less energy [Horowitz, 2014].
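As a rough illustration of what it means to run a matrix multiplication on integer hardware, the sketch below quantizes two small floating-point matrices to signed 8-bit integers, multiplies them as integers, and rescales the result. It assumes a generic symmetric per-tensor scheme applied after the fact; it is not the training-time range-precision method the abstract describes, and the function names and example matrices are hypothetical.

```python
def quantize(values, num_bits=8):
    """Symmetric per-tensor quantization: floats -> signed integers plus a scale."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for INT8
    max_abs = max(abs(v) for row in values for v in row)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = [[max(-qmax, min(qmax, round(v / scale))) for v in row] for row in values]
    return q, scale


def matmul(a, b):
    """Plain matrix product; works for both integer and float entries."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def quantized_matmul(a, b, num_bits=8):
    """Multiply in the integer domain, then rescale back to floats."""
    a_q, a_scale = quantize(a, num_bits)
    b_q, b_scale = quantize(b, num_bits)
    acc = matmul(a_q, b_q)  # integer accumulation (wider than 8 bits)
    return [[v * a_scale * b_scale for v in row] for row in acc]


# Hypothetical example matrices, chosen only for illustration.
a = [[0.5, -1.0], [0.25, 0.75]]
b = [[1.0, 0.0], [-0.5, 2.0]]

approx = quantized_matmul(a, b)
exact = matmul(a, b)
max_err = max(abs(x - y) for ra, re in zip(approx, exact) for x, y in zip(ra, re))
print("max absolute error:", max_err)
```

Storing each value in 8 bits instead of 32 is where the fourfold storage and bandwidth reduction comes from. Passing `num_bits=6` to `quantized_matmul` gives a coarser grid, loosely analogous to the INT6 setting mentioned in the abstract.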


New eBay platform using AI to enable image search and internal innovation

#artificialintelligence

Many of the biggest tech companies like Google, Facebook and Amazon have realized the value of creating their own AI platforms for both internal and customer-facing services. Facebook's FBLearner Flow helps the social media site filter out offensive posts, while Uber's Michelangelo gives users time predictions for food deliveries. To keep up with the competition, eBay has unveiled its AI platform, Krylov, which has given the company a wide range of new capabilities from improved language translation services to searching with images. In a blog post, eBay's Sanjeev Katariya, vice president and chief architect of eBay AI and platforms, and Ashok Ramani, director of product management, computer vision and natural language processing, discussed the creation of Krylov and how it has changed things both inside eBay and for users of the site. "With computer vision powered by eBay's modern AI platform, the technology helps you find items based on the click of your camera or an image. Users can go onto the eBay app and take a photo of what they are looking for and within milliseconds, the platform surfaces items that match the image," Katariya and Ramani wrote in December.


Machine Learning Packs an Economic Punch: eBay's Sharp Increase in International Commerce

#artificialintelligence

A new study co-authored by an MIT economist shows that improved translation software can significantly boost international trade online -- a notable case of machine learning having a clear impact on economic activity. The research finds that after eBay improved its automatic translation program in 2014, commerce shot up by 10.9 percent among pairs of countries where people could use the new system. "To have it be so clear in such a short amount of time really says a lot about the power of this technology," says Erik Brynjolfsson, an MIT economist and co-author of a new paper detailing the results. To put the results in perspective, he adds, consider that physical distance is, by itself, also a significant barrier to global commerce. The 10.9 percent change generated by eBay's new translation software increases trade by the same amount as "making the world 26 percent smaller, in terms of its impact on the goods that we studied," he says. The paper, "Does Machine Translation Affect International Trade?


Stand-Alone Self-Attention in Vision Models

Neural Information Processing Systems

Convolutions are a fundamental building block of modern computer vision systems. Recent approaches have argued for going beyond convolutions in order to capture long-range dependencies. These efforts focus on augmenting convolutional models with content-based interactions, such as self-attention and non-local means, to achieve gains on a number of vision tasks. The natural question that arises is whether attention can be a stand-alone primitive for vision models instead of serving as just an augmentation on top of convolutions. In developing and testing a pure self-attention vision model, we verify that self-attention can indeed be an effective stand-alone layer. A simple procedure of replacing all instances of spatial convolutions in ResNet-50 with a form of self-attention produces a fully self-attentional model that outperforms the baseline on ImageNet classification with 12% fewer FLOPS and 29% fewer parameters. On COCO object detection, a fully self-attention model matches the mAP of a baseline RetinaNet while having 39% fewer FLOPS and 34% fewer parameters. Detailed ablation studies demonstrate that self-attention is especially impactful when used in later layers. These results establish that stand-alone self-attention is an important addition to the vision practitioner's toolbox.
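To make the idea of using attention in place of a spatial convolution more concrete, here is a minimal single-head sketch that computes the output at one pixel as a softmax-weighted sum over its k x k neighborhood. It is an assumption-laden toy, not the paper's layer: it carries no positional information, uses one head, clips the window at image borders rather than padding, and all names (`local_self_attention`, `w_q`, `w_k`, `w_v`) are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(w, x):
    """Matrix-vector product for nested lists."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def local_self_attention(fmap, i, j, w_q, w_k, w_v, k=3):
    """Output at pixel (i, j): attention over the k x k window around it."""
    h, w = len(fmap), len(fmap[0])
    r = k // 2
    q = matvec(w_q, fmap[i][j])  # query from the center pixel
    keys, values = [], []
    for a in range(max(0, i - r), min(h, i + r + 1)):
        for b in range(max(0, j - r), min(w, j + r + 1)):
            keys.append(matvec(w_k, fmap[a][b]))
            values.append(matvec(w_v, fmap[a][b]))
    # Content-based weights: dot product of the query with each key.
    weights = softmax([sum(x * y for x, y in zip(q, key)) for key in keys])
    dim = len(values[0])
    return [sum(wt * v[d] for wt, v in zip(weights, values)) for d in range(dim)]


# Hypothetical 3 x 3 feature map with 2 channels and identity projections.
identity = [[1.0, 0.0], [0.0, 1.0]]
fmap = [[[1.0, 2.0] for _ in range(3)] for _ in range(3)]
out = local_self_attention(fmap, 1, 1, identity, identity, identity)
print(out)
```

Unlike a convolution, whose weights are fixed per spatial offset, the weights here depend on the content of the pixels in the window.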


Learning from Learning Machines: Optimisation, Rules, and Social Norms

arXiv.org Machine Learning

There is an analogy between machine learning systems and economic entities in that they are both adaptive, and their behaviour is specified in a more-or-less explicit way. It appears that the area of AI that is most analogous to the behaviour of economic entities is that of morally good decision-making, but it is an open question as to how precisely moral behaviour can be achieved in an AI system. This paper explores the analogy between these two complex systems, and we suggest that a clearer understanding of this apparent analogy may help us forward in both the socio-economic domain and the AI domain: known results in economics may help inform feasible solutions in AI safety, but also known results in AI may inform economic policy. If this claim is correct, then the recent successes of deep learning for AI suggest that more implicit specifications work better than explicit ones for solving such problems.


Coursera Corpus Mining and Multistage Fine-Tuning for Improving Lectures Translation

arXiv.org Artificial Intelligence

Lectures translation is a case of spoken language translation and there is a lack of publicly available parallel corpora for this purpose. To address this, we examine a language independent framework for parallel corpus mining which is a quick and effective way to mine a parallel corpus from publicly available lectures at Coursera. Our approach determines sentence alignments, relying on machine translation and cosine similarity over continuous-space sentence representations. We also show how to use the resulting corpora in a multistage fine-tuning based domain adaptation for high-quality lectures translation. For Japanese--English lectures translation, we extracted parallel data of approximately 40,000 lines and created development and test sets through manual filtering for benchmarking translation performance. We demonstrate that the mined corpus greatly enhances the quality of translation when used in conjunction with out-of-domain parallel corpora via multistage training. This paper also suggests some guidelines to gather and clean corpora, mine parallel sentences, address noise in the mined data, and create high-quality evaluation splits. For the sake of reproducibility, we will release our code for parallel data creation.
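The sentence-alignment step described above can be pictured with a small sketch: given sentence vectors for both sides of a lecture, pair each source sentence with its most similar target sentence by cosine similarity and keep the pair only if the score is high enough. This assumes the vectors are already computed in a shared space (for example after machine-translating one side); the greedy matching, the `threshold` value, and the toy vectors are hypothetical choices and not necessarily what the paper does.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def align(src_vecs, tgt_vecs, threshold=0.8):
    """Greedy alignment: best target per source, kept if score >= threshold."""
    if not tgt_vecs:
        return []
    pairs = []
    for i, s in enumerate(src_vecs):
        scores = [cosine(s, t) for t in tgt_vecs]
        j = max(range(len(scores)), key=scores.__getitem__)
        if scores[j] >= threshold:
            pairs.append((i, j, scores[j]))
    return pairs


# Toy sentence vectors, invented for illustration only.
src = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 1.0, 1.0]]
tgt = [[0.0, 0.9, 0.1], [0.9, 0.1, 0.0], [-1.0, 0.0, 0.0]]

pairs = align(src, tgt)
for i, j, score in pairs:
    print(f"source {i} <-> target {j} (cosine {score:.3f})")
```

In this toy run the third source sentence has no sufficiently similar target and is dropped, which is one simple way of keeping noise out of a mined corpus.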