Machine Translation
Microsoft Translator beefs up offline capabilities with new AI-powered translations
The Microsoft Translator app's most accurate translations, powered by the company's emphasis on artificial intelligence, are now available offline. Microsoft Translator first allowed users to download entire languages for offline translation in 2016. But this new update focuses on AI-powered "neural translation technology," which the company says produces translations that are 23 percent more accurate than the previously available offline packs. The technology is also open to third-party developers, allowing them to integrate AI translations into their apps. Offline capabilities are available now on Android devices and will reach iOS devices by the end of the week.
AI and Language Automation: Opportunity or Calamity for Localization Services Providers?
Localization (also referred to as "l10n") is the process of adapting a product or content to a specific geographic locale or market with the aim of giving it the look and feel of having been created specifically for a target market, no matter the language, culture, or location. Language translation and cultural adaptation are obviously a big part of localization, and globally visible companies rely heavily on sophisticated technology and localization engineering to get the job done. Localization is a complex process--some of it is automated by tools, but much of it is still a human-driven, manual undertaking. So it's no wonder that recent AI advances in Machine Translation (MT), as well as the allure of automated one-click translation platforms, have caused a stir in the translation and localization industry. Some fear that this development might spell doom for language professionals and perhaps even the end of language service providers (LSPs) altogether. So, is complete push-button localization imminent or hyped?
Microsoft's AI-powered offline translation now runs on any phone
Like many translation apps, Microsoft Translator has only used AI to decipher phrases while you have an internet connection. That's not much help if you're on a vacation in a place where mobile data is just a distant memory. Well, you won't have to sacrifice quality for much longer -- Microsoft has released offline language packs for Translator (currently on Android, iOS and Amazon Fire devices) that use AI for translation when you're offline regardless of your hardware. The move not only provides higher-quality translations, but also shrinks the size of the language packs by half. If you're a jetsetter, you might not have to shuffle language packs whenever you visit a new country.
The Advent of Huang's Law
It's been known for some time that Moore's law is dying. Transistor densities don't quite rise at the same rates that they used to [1]. For this reason, the last decade of computer scientists have been trained not to expect their code to get faster without effort. Multicore CPU systems remain hard to program and often require significant tuning by a skilled programmer to perform well. At the same time, the growth of mobile computing has led to a Cambrian explosion in the broad applications of deployed programs.
Saving language: How will the rise of AI affect linguistics?
Often, there is no "perfect" answer. While much is logical, many elements are harder to explain, untethered as they are to any fixed set of rules. For instance, when is a thought expressed with an indicative vs a subjunctive mood? When to use polite vs casual phrasing in languages such as Korean or Japanese? How to articulate an expression that doesn't exist in a target language?
AI translation needs work after error-filled debut at Boao Forum
The day that machines will surpass the ability of humans in language translation could still be many years away, following the breakdown of Tencent Holdings' artificial intelligence-powered translation system during the Boao Forum for Asia in Hainan province last week. Tencent's simultaneous translation system, which was designed to provide both interpretation and transcripts, made an error-filled debut at the high-profile forum, sometimes known as Asia's Davos. It spouted gibberish that was displayed live on screen at the event and in a WeChat mini program. The errors included garbled characters, repeated words and even broken Chinese, screenshots of which were widely circulated on social media last week. Tencent took the high ground by conceding the errors and aiming for improvements in future.
Deep Probabilistic Programming Languages: A Qualitative Study
Baudart, Guillaume, Hirzel, Martin, Mandel, Louis
Deep probabilistic programming languages try to combine the advantages of deep learning with those of probabilistic programming languages. If successful, this would be a big step forward in machine learning and programming languages. Unfortunately, as of now, this new crop of languages is hard to use and understand.
Multi-Reward Reinforced Summarization with Saliency and Entailment
Pasunuru, Ramakanth, Bansal, Mohit
Abstractive text summarization is the task of compressing and rewriting a long document into a short summary while maintaining saliency, directed logical entailment, and non-redundancy. In this work, we address these three important aspects of a good summary via a reinforcement learning approach with two novel reward functions: ROUGESal and Entail, on top of a coverage-based baseline. The ROUGESal reward modifies the ROUGE metric by up-weighting the salient phrases/words detected via a keyphrase classifier. The Entail reward gives high (length-normalized) scores to logically-entailed summaries using an entailment classifier. Further, we show superior performance improvement when these rewards are combined with traditional metric (ROUGE) based rewards, via our novel and effective multi-reward approach of optimizing multiple rewards simultaneously in alternate mini-batches. Our method achieves new state-of-the-art results on the CNN/Daily Mail dataset as well as strong improvements in a test-only transfer setup on DUC-2002.
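The two ideas in this abstract, up-weighting salient words in an overlap metric and switching rewards between mini-batches, can be illustrated with a toy sketch. This is not the authors' implementation: the hand-supplied `salience` dictionary stands in for their keyphrase classifier, the metric is plain unigram recall rather than full ROUGE, and the function names are hypothetical.

```python
from collections import Counter
from itertools import cycle


def weighted_unigram_recall(candidate, reference, salience, default_weight=1.0):
    """Salience-weighted unigram recall (a toy stand-in for a ROUGESal-style reward).

    `salience` is an assumed token -> weight mapping; tokens missing from it
    receive `default_weight`.
    """
    cand_counts = Counter(candidate)
    ref_counts = Counter(reference)
    matched = 0.0
    total = 0.0
    for token, ref_n in ref_counts.items():
        weight = salience.get(token, default_weight)
        total += weight * ref_n
        # Clip matches so a token repeated in the candidate is not over-counted.
        matched += weight * min(ref_n, cand_counts[token])
    return matched / total if total else 0.0


def alternate_rewards(reward_names, num_batches):
    """Assign one reward per mini-batch, cycling through the rewards in order."""
    schedule = cycle(reward_names)
    return [next(schedule) for _ in range(num_batches)]


reference = ["the", "cat", "sat"]
candidate = ["the", "cat", "ran"]
plain = weighted_unigram_recall(candidate, reference, salience={})
salient = weighted_unigram_recall(candidate, reference, salience={"cat": 2.0})
schedule = alternate_rewards(["rouge", "rougesal", "entail"], 5)
```

In this example, marking "cat" as salient raises the score for a candidate that recovers it, and the schedule simply rotates which reward drives each mini-batch update.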
Can Neural Machine Translation be Improved with User Feedback?
Kreutzer, Julia, Khadivi, Shahram, Matusov, Evgeny, Riezler, Stefan
We present the first real-world application of methods for improving neural machine translation (NMT) with human reinforcement, based on explicit and implicit user feedback collected on the eBay e-commerce platform. Previous work has been confined to simulation experiments, whereas in this paper we work with real logged feedback for offline bandit learning of NMT parameters. We conduct a thorough analysis of the available explicit user judgments---five-star ratings of translation quality---and show that they are not reliable enough to yield significant improvements in bandit learning. In contrast, we successfully utilize implicit task-based feedback collected in a cross-lingual search task to improve task-specific and machine translation quality metrics.
Reference-less Measure of Faithfulness for Grammatical Error Correction
Evaluation in Monolingual Translation, and particularly in Grammatical Error Correction (GEC), is a challenging research field, largely due to the difficulty in integrating different types of rewriting operations into a single measure, and the vast number of valid outputs (Tetreault and Chodorow, 2008; Madnani et al., 2011; Chodorow et al., 2012; Bryant and Ng, 2015). These difficulties have recently motivated a number of proposals for new, improved reference-based measures (RBMs) (Dahlmeier and Ng, 2012; Felice and Briscoe, 2015; Napoles et al., 2015). Nevertheless, the size and heterogeneity of the space of valid outputs per sentence often prohibits obtaining a reference set that covers this space well, thereby limiting the applicability of RBMs (Bryant and Ng, 2015).