Machine Translation
Tools For Building Machine Learning Models On Android
Since its launch in 2008, Android has become the world's biggest mobile platform in terms of popularity and number of users. Over the years, Android developers have used advances in machine learning to build features like on-device speech recognition, real-time video interactivity, and real-time enhancements when taking a photo or selfie. In addition, image recognition with machine learning lets users point their smartphone camera at text and have it live-translated into 88 different languages with the help of Google Translate. Android users can even point their camera at a beautiful flower, use Google Lens to identify what type of flower it is, and then set a reminder to order a bouquet for someone. Google Lens uses computer vision models to expand and speed up web search and the mobile experience.
A Game Of Telephone: How Accurate Can Translation Really Be?
Imagine sitting in a circle with a few people where each of you knows only two languages -- one shared with the person on your left, and one shared with the person on your right. If you say something to the person on your right and ask them to pass on the message, it might very well be that, after passing through all the languages, it comes out sounding very different from the original message. This might seem like a very weird game of Telephone to you, but just as whispering impairs your ability to hear the message, translation works as an imperfect communication channel. When you try to translate a message into a different language, you can change its intended meaning without being aware of it. Oftentimes messages are subjective, ambiguous, or, in some cases, even impossible to represent without any loss of information. But why is translation such a challenge? And given that it is, can we ever achieve such a thing as a perfect translation?
Pre-training via Paraphrasing
Lewis, Mike, Ghazvininejad, Marjan, Ghosh, Gargi, Aghajanyan, Armen, Wang, Sida, Zettlemoyer, Luke
We introduce MARGE, a pre-trained sequence-to-sequence model learned with an unsupervised multi-lingual multi-document paraphrasing objective. MARGE provides an alternative to the dominant masked language modeling paradigm, where we self-supervise the reconstruction of target text by retrieving a set of related texts (in many languages) and conditioning on them to maximize the likelihood of generating the original. We show it is possible to jointly learn to do retrieval and reconstruction, given only a random initialization. The objective noisily captures aspects of paraphrase, translation, multi-document summarization, and information retrieval, allowing for strong zero-shot performance on several tasks. For example, with no additional task-specific training we achieve BLEU scores of up to 35.8 for document translation. We further show that fine-tuning gives strong performance on a range of discriminative and generative tasks in many languages, making MARGE the most generally applicable pre-training method to date.
Neural Machine Translation For Paraphrase Generation
Sokolov, Alex, Filimonov, Denis
Training a spoken language understanding system, such as the one in Alexa, typically requires a large human-annotated corpus of data. Manual annotations are expensive and time consuming. In the Alexa Skills Kit (ASK), the user experience with a skill greatly depends on the amount of data provided by the skill developer. In this work, we present an automatic natural language generation system capable of generating both human-like interactions and annotations by means of paraphrasing. Our approach consists of a machine translation (MT) inspired encoder-decoder deep recurrent neural network. We evaluate our model on the impact it has on ASK skill, intent, and named entity classification accuracy and on sentence-level coverage, all of which demonstrate significant improvements for unseen skills on natural language understanding (NLU) models trained on data augmented with paraphrases.
What I learned from looking at 200 machine learning tools
To better understand the landscape of available tools for machine learning production, I decided to look up every AI/ML tool I could find. After filtering out applications companies (e.g. companies that use ML to provide business analytics), tools that aren't being actively developed, and tools that nobody uses, I got 202 tools. Please let me know if there are tools you think I should include but aren't on the list yet! I categorize the tools based on which step of the workflow they support.
DeepMnemonic: Password Mnemonic Generation via Deep Attentive Encoder-Decoder Model
Cheng, Yao, Xu, Chang, Hai, Zhen, Li, Yingjiu
Strong passwords are fundamental to the security of password-based user authentication systems. In recent years, much effort has been made to evaluate password strength or to generate strong passwords. Unfortunately, the usability or memorability of strong passwords has been largely neglected. In this paper, we aim to bridge the gap between strong password generation and the usability of strong passwords. We propose to automatically generate textual password mnemonics, i.e., natural language sentences, which are intended to help users better memorize passwords. We introduce DeepMnemonic, a deep attentive encoder-decoder framework which takes a password as input and then automatically generates a mnemonic sentence for the password. We conduct extensive experiments to evaluate DeepMnemonic on real-world data sets. The experimental results demonstrate that DeepMnemonic outperforms a well-known baseline for generating semantically meaningful mnemonic sentences. Moreover, the user study further validates that the mnemonic sentences generated by DeepMnemonic are useful in helping users memorize strong passwords.
Differentiable Window for Dynamic Local Attention
Nguyen, Thanh-Tung, Nguyen, Xuan-Phi, Joty, Shafiq, Li, Xiaoli
We propose Differentiable Window, a new neural module and general purpose component for dynamic window selection. While universally applicable, we demonstrate a compelling use case of utilizing Differentiable Window to improve standard attention modules by enabling more focused attentions over the input regions. We propose two variants of Differentiable Window, and integrate them within the Transformer architecture in two novel ways. We evaluate our proposed approach on a myriad of NLP tasks, including machine translation, sentiment analysis, subject-verb agreement and language modeling. Our experimental results demonstrate consistent and sizable improvements across all tasks.
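To make the notion of "more focused attention over input regions" concrete, here is a minimal sketch of a soft window applied to ordinary attention weights. It is not the paper's formulation: the sigmoid-product window, the `sharpness` parameter, and the function names `soft_window` and `windowed_attention` are all assumptions chosen for illustration. The point is only that a window built from smooth functions of its boundaries stays differentiable with respect to those boundaries.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def soft_window(length, left, right, sharpness=5.0):
    """Smooth mask close to 1 for positions between left and right, close to 0 outside.

    Illustrative assumption: the product of two sigmoids, which is
    differentiable with respect to both boundaries.
    """
    return [
        sigmoid(sharpness * (i - left)) * sigmoid(sharpness * (right - i))
        for i in range(length)
    ]


def windowed_attention(scores, left, right, sharpness=5.0):
    """Reweight softmax attention by the soft window, then renormalize."""
    weights = softmax(scores)
    mask = soft_window(len(scores), left, right, sharpness)
    masked = [w * m for w, m in zip(weights, mask)]
    total = sum(masked)
    return [x / total for x in masked]


# Uniform scores over six positions, window centred on positions 2 and 3.
attn = windowed_attention([0.0] * 6, left=1.5, right=3.5)
```

With uniform scores, almost all of the attention mass ends up on the positions inside the window, while positions outside it keep a small but non-zero weight.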
Self-Knowledge Distillation: A Simple Way for Better Generalization
Kim, Kyungyul, Ji, ByeongMoon, Yoon, Doyoung, Hwang, Sangheum
The generalization capability of deep neural networks has been substantially improved by applying a wide spectrum of regularization methods, e.g., restricting function space, injecting randomness during training, augmenting data, etc. In this work, we propose a simple yet effective regularization method named self-knowledge distillation (Self-KD), which progressively distills a model's own knowledge to soften hard targets (i.e., one-hot vectors) during training. Hence, it can be interpreted within a framework of knowledge distillation in which a student becomes its own teacher. The proposed method is applicable to any supervised learning task with hard targets and can be easily combined with existing regularization methods to further enhance the generalization performance. Furthermore, we show that Self-KD not only achieves better accuracy but also provides high-quality confidence estimates. Extensive experimental results on three different tasks (image classification, object detection, and machine translation) demonstrate that our method consistently improves the performance of the state-of-the-art baselines; in particular, it achieves state-of-the-art BLEU scores of 30.0 and 36.2 on IWSLT15 English-to-German and German-to-English tasks, respectively.
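As a rough illustration of softening hard targets with a model's own earlier predictions, the sketch below blends a one-hot label with a past prediction and computes cross-entropy against the blended target. The blending weight `alpha`, the linear ramp in `alpha_schedule`, the value `alpha_max=0.8`, and all function names are assumptions made for illustration; the abstract does not specify them.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def alpha_schedule(epoch, total_epochs, alpha_max=0.8):
    """Hypothetical linear ramp: trust the model's own predictions more over time."""
    return alpha_max * epoch / total_epochs


def soften_target(one_hot, past_probs, alpha):
    """Blend the hard target with the model's own earlier prediction."""
    return [(1.0 - alpha) * h + alpha * p for h, p in zip(one_hot, past_probs)]


def cross_entropy(target, probs, eps=1e-12):
    """Cross-entropy between a (possibly soft) target and predicted probabilities."""
    return -sum(t * math.log(p + eps) for t, p in zip(target, probs))


# One training example with three classes; the true class is index 1.
one_hot = [0.0, 1.0, 0.0]
past_probs = softmax([1.0, 2.0, 0.5])        # prediction saved from an earlier epoch
current_probs = softmax([0.8, 2.5, 0.3])     # prediction at the current epoch

alpha = alpha_schedule(epoch=5, total_epochs=10)
soft_target = soften_target(one_hot, past_probs, alpha)
loss = cross_entropy(soft_target, current_probs)
```

Because both the one-hot vector and the past prediction sum to one, the blended target is still a valid probability distribution, and setting `alpha` to zero recovers ordinary training on hard targets.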
Modelling High-Level Mathematical Reasoning in Mechanised Declarative Proofs
Li, Wenda, Yu, Lei, Wu, Yuhuai, Paulson, Lawrence C.
Mathematical proofs can be mechanised using proof assistants to eliminate gaps and errors. However, mechanisation still requires intensive labour. To promote automation, it is essential to capture high-level human mathematical reasoning, which we address as the problem of generating suitable propositions. We build a non-synthetic dataset from the largest repository of mechanised proofs and propose a task on causal reasoning, where a model is required to fill in a missing intermediate proposition given a causal context. Our experiments (using various neural sequence-to-sequence models) reveal that while the task is challenging, neural models can indeed capture non-trivial mathematical reasoning. We further propose a hierarchical transformer model that outperforms the transformer baseline.
Gender in Danger? Evaluating Speech Translation Technology on the MuST-SHE Corpus
Bentivogli, Luisa, Savoldi, Beatrice, Negri, Matteo, Di Gangi, Mattia Antonino, Cattoni, Roldano, Turchi, Marco
Translating from languages without productive grammatical gender like English into gender-marked languages is a well-known difficulty for machines. This difficulty is also due to the fact that the training data on which models are built typically reflect the asymmetries of natural languages, gender bias included. Exclusively fed with textual data, machine translation is intrinsically constrained by the fact that the input sentence does not always contain clues about the gender identity of the referred human entities. But what happens with speech translation, where the input is an audio signal? Can audio provide additional information to reduce gender bias? We present the first thorough investigation of gender bias in speech translation, contributing with: i) the release of a benchmark useful for future studies, and ii) the comparison of different technologies (cascade and end-to-end) on two language directions (English-Italian/French).