Machine Translation
Seq2RDF: An end-to-end application for deriving Triples from Natural Language Text
Liu, Yue, Zhang, Tongtao, Liang, Zhicheng, Ji, Heng, McGuinness, Deborah L.
We present an end-to-end approach that takes unstructured textual input and generates structured output compliant with a given vocabulary. Inspired by recent successes in neural machine translation, we treat the triples within a given knowledge graph as an independent graph language and propose an encoder-decoder framework with an attention mechanism that leverages knowledge graph embeddings. Our model learns the mapping from natural language text to triple representation in the form of subject-predicate-object using the selected knowledge graph vocabulary. Experiments on three different data sets show that we achieve competitive F1-Measures over the baselines using our simple yet effective approach. A demo video is included.
Latent Alignment and Variational Attention
Deng, Yuntian, Kim, Yoon, Chiu, Justin, Guo, Demi, Rush, Alexander M.
Neural attention has become central to many state-of-the-art models in natural language processing and related domains. Attention networks are an easy-to-train and effective method for softly simulating alignment; however, the approach does not marginalize over latent alignments in a probabilistic sense. This property makes it difficult to compare attention to other alignment approaches, to compose it with probabilistic models, and to perform posterior inference conditioned on observed data. A related latent approach, hard attention, fixes these issues, but is generally harder to train and less accurate. This work considers variational attention networks, alternatives to soft and hard attention for learning latent variable alignment models, with tighter approximation bounds based on amortized variational inference. We further propose methods for reducing the variance of gradients to make these approaches computationally feasible. Experiments show that for machine translation and visual question answering, inefficient exact latent variable models outperform standard neural attention, but these gains go away when using hard attention based training. On the other hand, variational attention retains most of the performance gain but with training speed comparable to neural attention.
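The contrast between soft attention and marginalizing over latent alignments can be illustrated with a toy sketch. It assumes one scalar value per source position and a hypothetical `predict` function (a sigmoid here) standing in for the output layer; neither is taken from the paper.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def soft_attention_prob(align_probs, source_values, predict=sigmoid):
    # Soft attention: average the source values first, then predict once
    # from the resulting deterministic context.
    context = sum(p * v for p, v in zip(align_probs, source_values))
    return predict(context)

def marginal_prob(align_probs, source_values, predict=sigmoid):
    # Latent alignment: predict from each source position separately,
    # then average the output probabilities over alignments.
    return sum(p * predict(v) for p, v in zip(align_probs, source_values))

# Hypothetical alignment distribution and source values.
align_probs = [0.2, 0.5, 0.3]
source_values = [-2.0, 0.0, 4.0]

soft = soft_attention_prob(align_probs, source_values)
exact = marginal_prob(align_probs, source_values)
print(soft, exact)
```

The two quantities coincide when `predict` is linear and generally differ otherwise, which is one way to see why soft attention is not a marginal likelihood over alignments.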
Memory Augmented Policy Optimization for Program Synthesis with Generalization
Liang, Chen, Norouzi, Mohammad, Berant, Jonathan, Le, Quoc, Lao, Ni
This paper presents Memory Augmented Policy Optimization (MAPO): a novel policy optimization formulation that incorporates a memory buffer of promising trajectories to reduce the variance of policy gradient estimates for deterministic environments with discrete actions. The formulation expresses the expected return objective as a weighted sum of two terms: an expectation over a memory of trajectories with high rewards, and a separate expectation over the trajectories outside the memory. We propose 3 techniques to make an efficient training algorithm for MAPO: (1) distributed sampling from inside and outside memory with an actor-learner architecture; (2) a marginal likelihood constraint over the memory to accelerate training; (3) systematic exploration to discover high reward trajectories. MAPO improves the sample efficiency and robustness of policy gradient, especially on tasks with a sparse reward. We evaluate MAPO on weakly supervised program synthesis from natural language with an emphasis on generalization. On the WikiTableQuestions benchmark we improve the state-of-the-art by 2.5%, achieving an accuracy of 46.2%, and on the WikiSQL benchmark, MAPO achieves an accuracy of 74.9% with only weak supervision, outperforming several strong baselines with full supervision. Our code is open sourced at https://github.com/crazydonkey200/neural-symbolic-machines
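The weighted-sum form of the expected return described above can be checked on a toy example. The sketch below assumes the weight on the memory term is the total policy probability of the trajectories in memory; the trajectory names, probabilities, and rewards are hypothetical.

```python
def expected_return(probs, rewards):
    """Plain expectation of the reward under the policy."""
    return sum(probs[t] * rewards[t] for t in probs)

def memory_split_return(probs, rewards, memory):
    """Same expectation, split into inside-memory and outside-memory terms."""
    inside = [t for t in probs if t in memory]
    outside = [t for t in probs if t not in memory]
    pi_b = sum(probs[t] for t in inside)  # total probability of the memory

    def conditional_expectation(trajectories, mass):
        # Expectation under the policy renormalized to `trajectories`.
        if mass == 0.0:
            return 0.0
        return sum(probs[t] / mass * rewards[t] for t in trajectories)

    return (pi_b * conditional_expectation(inside, pi_b)
            + (1.0 - pi_b) * conditional_expectation(outside, 1.0 - pi_b))

# Hypothetical policy over four trajectories (programs).
probs = {"a": 0.1, "b": 0.2, "c": 0.3, "d": 0.4}
rewards = {"a": 1.0, "b": 0.0, "c": 1.0, "d": 0.0}
memory = {"a", "c"}  # high-reward trajectories kept in the buffer

full = expected_return(probs, rewards)
split = memory_split_return(probs, rewards, memory)
print(full, split)
```

In this sketch both terms are computed exactly. In a real setting the memory is small enough to enumerate, while the outside term would have to be estimated by sampling, which is presumably where the variance reduction comes from.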
The Limitations of Machine Learning
Machine learning is one of the newest technologies that is poised to make significant changes in the way companies conduct their business. Machine learning refers to computer technology that relays intelligent output based on algorithmic decisions made after processing a user's input. While still in its infancy, machine learning has already started being rolled out to consumers through different applications, such as Apple's Siri, Amazon's Alexa, and Microsoft's Cortana, among others. Apart from voice, the technology is used to process image data (e.g. Various reports indicate that advanced machine learning systems will leave translators out of work in the near future.
Oracle-free Detection of Translation Issue for Neural Machine Translation
Zheng, Wujie, Wang, Wenyu, Liu, Dian, Zhang, Changrong, Zeng, Qinsong, Deng, Yuetang, Yang, Wei, Xie, Tao
Neural Machine Translation (NMT) has been widely adopted over recent years due to its advantages on various translation tasks. However, NMT systems can be error-prone due to the intractability of natural languages and the design of neural networks, bringing issues to their translations. These issues could potentially lead to information loss, wrong semantics, and low readability in translations, compromising the usefulness of NMT and leading to potential non-trivial consequences. Although there are existing approaches to quality assessment and issue detection for NMT, such as using the BLEU score, such approaches face two serious limitations. First, such solutions require oracle translations, i.e., reference translations, which are often unavailable, e.g., in production environments. Second, such approaches cannot pinpoint the issue types and locations within translations. To address such limitations, we propose a new approach aiming to precisely detect issues in translations without requiring oracle translations. Our approach focuses on the two most prominent issues in NMT translations by including two detection algorithms. Our experimental results show that our new approach could achieve high effectiveness on real-world datasets. Our successful experience in deploying the proposed algorithms in both the development and production environments of WeChat, a messenger app with over one billion monthly active users, helps eliminate numerous defects of our NMT model, monitor the effectiveness on real-world translation tasks, and collect in-house test cases, producing high industry impact.
Program Language Translation Using a Grammar-Driven Tree-to-Tree Model
Drissi, Mehdi, Watkins, Olivia, Khant, Aditya, Ojha, Vivaswat, Sandoval, Pedro, Segev, Rakia, Weiner, Eric, Keller, Robert
The task of translating between programming languages differs from the challenge of translating natural languages in that programming languages are designed with a far more rigid set of structural and grammatical rules. Previous work has used a tree-to-tree encoder/decoder model to take advantage of the inherent tree structure of programs during translation. Neural decoders, however, by default do not exploit known grammar rules of the target language. In this paper, we describe a tree decoder that leverages knowledge of a language's grammar rules to exclusively generate syntactically correct programs. We find that this grammar-based tree-to-tree model outperforms the state of the art tree-to-tree model in translating between two programming languages on a previously used synthetic task.
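The idea of restricting a decoder to grammar-legal choices can be sketched in a few lines. The toy grammar and the score dictionary below are hypothetical stand-ins for a real target-language grammar and a neural decoder's output; this is not the model from the paper.

```python
# Hypothetical toy grammar: nonterminal -> productions it may expand to.
GRAMMAR = {
    "Expr": [("Num",), ("Expr", "+", "Expr"), ("(", "Expr", ")")],
    "Num": [("0",), ("1",)],
}

def choose_production(nonterminal, scores):
    """Pick the best-scoring production among those the grammar allows here.

    Productions the decoder did not score are treated as -inf, and
    productions that belong to other nonterminals are never considered.
    """
    allowed = GRAMMAR[nonterminal]
    return max(allowed, key=lambda prod: scores.get(prod, float("-inf")))

# Stand-in for decoder scores; the top-scoring entry is illegal for "Expr".
scores = {("0",): 9.0, ("Num",): 1.5, ("Expr", "+", "Expr"): 0.2}

expr_choice = choose_production("Expr", scores)
num_choice = choose_production("Num", scores)
print(expr_choice, num_choice)
```

Because the choice is made only among productions of the current nonterminal, every expansion step is syntactically valid by construction, whatever the raw scores say.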
Google Translate AI
Being tongue-tied on holiday could become a thing of the past thanks to a major update to Google's Translate feature. Google has now introduced a new Translate AI that both Android and iPhone users can take advantage of. Google introduced neural machine translation two years ago, and the new AI is set to improve on previous translation features by working offline in 59 languages. These include English, Arabic, Chinese, German, and Hindi, to name a few, with each language using only 35MB. The Google app will allegedly be able to produce more accurate results than its predecessors and at a much faster rate.
The Top GitHub Repositories & Reddit Threads Every Data Scientist should know (June 2018) - Analytics Vidhya
Half the year has flown by and that brings us to the June edition of our popular series - the top GitHub repositories and Reddit threads from last month. During the course of writing these articles, I have learned so much about machine learning from either open source code or invaluable discussions among the top data science brains in the world. What makes GitHub special is not just its code hosting and social collaboration features for data scientists. It has lowered the entry barrier into the open source world and has played a MASSIVE role in spreading knowledge and expanding the machine learning community. We saw some amazing open source code being released in June.
Learning Semantic Sentence Embeddings using Pair-wise Discriminator
Patro, Badri N., Kurmi, Vinod K., Kumar, Sandeep, Namboodiri, Vinay P.
While the problem of obtaining word-level embeddings is very well studied, in this paper we propose a novel method for obtaining sentence-level embeddings. The embeddings are obtained by a simple method in the context of solving the paraphrase generation task. If we use a sequential encoder-decoder model for generating paraphrases, we would like the generated paraphrase to be semantically close to the original sentence. One way to ensure this is by adding constraints for true paraphrase embeddings to be close and unrelated paraphrase candidate sentence embeddings to be far. This is ensured by using a sequential pair-wise discriminator that shares weights with the encoder and is trained with a suitable loss function. Our loss function penalizes paraphrase sentence embedding distances that are too large. This loss is used in combination with a sequential encoder-decoder network. We also validated our method by evaluating the obtained embeddings on a sentiment analysis task. The proposed method results in semantic embeddings and outperforms the state-of-the-art on the paraphrase generation and sentiment analysis tasks on standard datasets. These results are also shown to be statistically significant.
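The constraint that paraphrase embeddings stay close while unrelated sentence embeddings stay far can be written as a margin-based loss. The sketch below uses a generic triplet-style hinge on Euclidean distance; this is an assumed illustration, and the exact loss, distance, and margin used in the paper may differ.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def pairwise_margin_loss(anchor, paraphrase, unrelated, margin=1.0):
    """Zero when the paraphrase is closer to the anchor than the unrelated
    sentence by at least `margin`; grows linearly otherwise."""
    pos = euclidean(anchor, paraphrase)  # should be small
    neg = euclidean(anchor, unrelated)   # should be large
    return max(0.0, pos - neg + margin)

# Hypothetical 2-d sentence embeddings.
anchor = [0.0, 0.0]
paraphrase = [0.1, 0.0]
unrelated = [3.0, 4.0]

good = pairwise_margin_loss(anchor, paraphrase, unrelated)
bad = pairwise_margin_loss(anchor, unrelated, paraphrase)
print(good, bad)
```

The first call has the embeddings arranged as desired and incurs no loss; swapping the roles of the paraphrase and the unrelated sentence yields a positive penalty.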
Potato, potato. Toma6to, I'm going to kill you... How a typo can turn an AI translator against us
Neural-network-based language translators can be tricked into deleting words from sentences or dramatically changing the meaning of a phrase, by strategically inserting typos and numbers. Just like twiddling pixels in a photo, or placing a specially crafted sticker near an object, can make image-recognition systems mistake bananas for toasters, it is possible to alter the translation of a sentence by tweaking the input. This isn't like altering "The black cat" to "The black cap", and making an English-to-French translation AI change its output from "Le chat noir" to "Le chapeau noir." That change is to be expected. No, we're talking about, for example, tweaking "Er ist Geigenbauer und Psychotherapeut" (He is a violin maker and psychotherapist) to "Er ist Geigenbauer und Psy6hothearpeiut", and getting the translation: "He is a brick maker and a psychopath."
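The kind of input tweak described above, inserting a digit or scrambling neighbouring letters, can be enumerated mechanically. The sketch below only generates candidate perturbations of a single word; a real attack would also need to score each candidate against the target translation model, which is not shown, and the function names are hypothetical.

```python
def digit_insertions(word, digits="0123456789"):
    """All variants of `word` with one digit inserted at an interior position."""
    return [word[:i] + d + word[i:]
            for i in range(1, len(word))
            for d in digits]

def adjacent_swaps(word):
    """All variants of `word` with two differing neighbouring characters swapped."""
    variants = []
    for i in range(len(word) - 1):
        if word[i] != word[i + 1]:
            chars = list(word)
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
            variants.append("".join(chars))
    return variants

candidates = digit_insertions("Tomato") + adjacent_swaps("Tomato")
print(len(candidates))
```

The headline's "Toma6to" is one of the digit-insertion candidates this produces for "Tomato".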