Machine Translation
Self-Attentional Models Application in Task-Oriented Dialogue Generation Systems
Mehrjardi, Mansour Saffar, Trabelsi, Amine, Zaiane, Osmar R.
Self-attentional models are a new paradigm for sequence modelling tasks. They differ from common sequence modelling methods, such as recurrence-based and convolution-based sequence learning, in that their architecture is based solely on the attention mechanism. Self-attentional models have been used to create state-of-the-art models in many NLP tasks such as neural machine translation, but their use has not yet been explored for training end-to-end task-oriented dialogue generation systems. In this study, we apply these models to three different datasets for training task-oriented chatbots. Our findings show that self-attentional models can be exploited to create end-to-end task-oriented chatbots which not only achieve higher evaluation scores than recurrence-based models, but also do so more efficiently.
The Unreasonable Effectiveness Of Neural Machine Translation: A Breakthrough In Temporal Expression Understanding
Written by Rakesh Chada and Marcos Jimenez, data scientists at x.ai. At x.ai we strive to make the pain associated with scheduling meetings a thing of the past. We've built a virtual assistant (it goes by the name of Amy or Andrew) who can be cc'd into your typical request to meet with people over email. Amy will "understand" the hand-over and just take it from there with your guests, following up with them to nail the time and location details for the meeting. Under the hood this means that Amy must automatically extract meeting-related pieces of information from your email and, mashing that up with your calendar and overall preferences, proceed to get your guests to agree to a time that works for you and them, plus gather whatever other details are needed for the meeting (phone conference number, meeting room, address, google hangout link, etc.). Now the hard, cool, data-science part. Amy "understanding" all the pieces of information from free-form human text presents us with a number of formidable and fascinating data science challenges. This is the realm of natural language processing (NLP), where recent strides in deep learning have made tackling these problems viable. The problem goes far beyond simply detecting words related to times and locations, or named entity recognition (NER).
Practical guide to Attention mechanism for NLU tasks
Chatbots, virtual assistants, and augmented analytic systems typically receive user queries such as "Find me an action movie by Steven Spielberg". The system should correctly detect the intent "find_movie" while filling the slot "genre" with the value "action" and the slot "directed_by" with the value "Steven Spielberg". This is a Natural Language Understanding (NLU) task known as Intent Classification & Slot Filling. State-of-the-art performance is typically obtained using recurrent neural network (RNN) based approaches, as well as by leveraging an encoder-decoder architecture with sequence-to-sequence models. In this article we demonstrate hands-on strategies for improving performance even further by adding an attention mechanism.
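To make the target output of Intent Classification & Slot Filling concrete, here is a minimal sketch of how a tagged query could be turned into an intent-plus-slots frame. It assumes the slot filler emits one BIO tag per token (B-x opens slot x, I-x continues it, O is outside any slot), which is a common convention but is not stated in the article. The function name slots_from_bio and the hand-written tags are hypothetical, and no model is involved.

```python
def slots_from_bio(tokens, tags):
    """Collect slot values from per-token BIO tags (assumed tagging scheme)."""
    slots = {}
    current = None  # name of the slot currently being extended, if any
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            current = tag[2:]
            slots[current] = token
        elif tag.startswith("I-") and current == tag[2:]:
            slots[current] += " " + token
        else:
            current = None  # an "O" tag or a stray "I-" tag closes the slot
    return slots


tokens = "Find me an action movie by Steven Spielberg".split()
# Hand-written tags standing in for a slot-filling model's predictions.
tags = ["O", "O", "O", "B-genre", "O", "O", "B-directed_by", "I-directed_by"]

frame = {"intent": "find_movie", "slots": slots_from_bio(tokens, tags)}
print(frame)
```

In a real system the intent label and the tag sequence would both come from the trained model; only the decoding step is shown here.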
Problems with automating translation of movie/TV show subtitles
Gupta, Prabhakar, Sharma, Mayank, Pitale, Kartik, Kumar, Keshav
We present 27 problems encountered in automating the translation of movie/TV show subtitles. We assign each problem to one of three categories: problems directly related to textual translation, problems related to subtitle creation guidelines, and problems due to the adaptability of machine translation (MT) engines. We also present the findings of a translation quality evaluation experiment in which we report the frequency of 16 key problems. We show that systems working at the frontiers of Natural Language Processing do not perform well on subtitles and require post-processing solutions to redress these problems.
Answers Unite! Unsupervised Metrics for Reinforced Summarization Models
Scialom, Thomas, Lamprier, Sylvain, Piwowarski, Benjamin, Staiano, Jacopo
Abstractive summarization approaches based on Reinforcement Learning (RL) have recently been proposed to overcome classical likelihood maximization. RL makes it possible to consider complex, possibly non-differentiable, metrics that globally assess the quality and relevance of the generated outputs. ROUGE, the most used summarization metric, is known to suffer from a bias towards lexical similarity as well as from suboptimal accounting for the fluency and readability of the generated abstracts. We thus explore and propose alternative evaluation measures: the reported human-evaluation analysis shows that the proposed metrics, based on Question Answering, compare favorably to ROUGE, with the additional property of not requiring reference summaries. Training an RL-based model on these metrics leads to improvements (in terms of both human and automated metrics) over current approaches that use ROUGE as a reward.
On the Downstream Performance of Compressed Word Embeddings
May, Avner, Zhang, Jian, Dao, Tri, Ré, Christopher
Compressing word embeddings is important for deploying NLP models in memory-constrained settings. However, understanding what makes compressed embeddings perform well on downstream tasks is challenging: existing measures of compression quality often fail to distinguish between embeddings that perform well and those that do not. We thus propose the eigenspace overlap score as a new measure. We relate the eigenspace overlap score to downstream performance by developing generalization bounds for the compressed embeddings in terms of this score, in the context of linear and logistic regression. We then show that we can lower bound the eigenspace overlap score for a simple uniform quantization compression method, helping to explain the strong empirical performance of this method. Finally, we show that by using the eigenspace overlap score as a selection criterion between embeddings drawn from a representative set we compressed, we can efficiently identify the better-performing embedding with up to $2\times$ lower selection error rates than the next best measure of compression quality, and avoid the cost of training a model for each task of interest.
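As an illustration of what a "simple uniform quantization compression method" can look like, the sketch below clips each embedding entry to a fixed range and rounds it to one of 2**bits evenly spaced levels. This is a generic textbook form, not necessarily the exact procedure used in the paper. The function name uniform_quantize, the clip value, and the toy vector are assumptions made for the example, and the eigenspace overlap score itself is not computed here.

```python
def uniform_quantize(values, bits, clip):
    """Round each value to the nearest of 2**bits evenly spaced levels in [-clip, clip]."""
    levels = 2 ** bits
    step = 2 * clip / (levels - 1)  # spacing between adjacent levels
    quantized = []
    for v in values:
        v = max(-clip, min(clip, v))       # clip outliers into the range
        index = round((v + clip) / step)   # integer code in [0, levels - 1]
        quantized.append(-clip + index * step)
    return quantized


# Toy 5-dimensional embedding (made-up numbers), compressed to 2 bits per entry.
embedding = [0.12, -0.57, 0.98, -1.40, 0.03]
compressed = uniform_quantize(embedding, bits=2, clip=1.0)
print(compressed)
```

Only the integer codes need to be stored, so each entry costs `bits` bits instead of a full floating-point value. For in-range entries the rounding error is at most half a step.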
Generating Classical Chinese Poems from Vernacular Chinese
Yang, Zhichao, Cai, Pengshan, Feng, Yansong, Li, Fei, Feng, Weijiang, Chiu, Elena Suet-Ying, Yu, Hong
Classical Chinese poetry is a jewel in the treasure house of Chinese culture. Previous poem generation models only allow users to employ keywords to influence the meaning of generated poems, leaving control of the generation to the model. In this paper, we propose a novel task of generating classical Chinese poems from vernacular Chinese, which allows users to have more control over the semantics of generated poems. We adapt the approach of unsupervised machine translation (UMT) to our task. We use segmentation-based padding and reinforcement learning to address under-translation and over-translation respectively. According to our experiments, our approach significantly improves perplexity and BLEU compared with typical UMT models. Furthermore, we explore guidelines on how to write the input vernacular to generate better poems. Human evaluation shows that our approach can generate high-quality poems which are comparable to amateur poems.
Evaluating Pronominal Anaphora in Machine Translation: An Evaluation Measure and a Test Suite
Jwalapuram, Prathyusha, Joty, Shafiq, Temnikova, Irina, Nakov, Preslav
The ongoing neural revolution in machine translation has made it easier to model larger contexts beyond the sentence-level, which can potentially help resolve some discourse-level ambiguities such as pronominal anaphora, thus enabling better translations. Unfortunately, even when the resulting improvements are seen as substantial by humans, they remain virtually unnoticed by traditional automatic evaluation measures like BLEU, as only a few words end up being affected. Thus, specialized evaluation measures are needed. With this aim in mind, we contribute an extensive, targeted dataset that can be used as a test suite for pronoun translation, covering multiple source languages and different pronoun errors drawn from real system translations, for English. We further propose an evaluation measure to differentiate good and bad pronoun translations. We also conduct a user study to report correlations with human judgments.
Facebook founds AI Language Research Consortium to solve challenges in natural language processing
Roughly three months ago, Facebook launched calls for research proposals in three subfields of natural language processing (NLP), the cross-disciplinary study of linguistics and AI concerned with computer-language interactions. It specifically sought "robust" deep learning approaches for NLP and computationally efficient NLP in addition to neural machine translation for low-resource dialects, ultimately in the pursuit of advancing cutting-edge research in machine translation. That was just the start, it would seem. In a blog post today announcing 11 winning proposals among the 115 submitted from 35 countries, Facebook announced the AI Language Research Consortium, a community of partners it says will "work together to advance priority research areas" in NLP. Details were tough to come by at press time, but Facebook says the newly formed group will foster collaboration to tackle challenging tasks like representation learning, content understanding, dialog systems, information extraction, sentiment analysis, summarization, data collection and cleaning, and speech translation.
Transformer Dissection: An Unified Understanding for Transformer's Attention via the Lens of Kernel
Tsai, Yao-Hung Hubert, Bai, Shaojie, Yamada, Makoto, Morency, Louis-Philippe, Salakhutdinov, Ruslan
The Transformer is a powerful architecture that achieves superior performance on various sequence learning tasks, including neural machine translation, language understanding, and sequence prediction. At the core of the Transformer is the attention mechanism, which concurrently processes all inputs in the streams. In this paper, we present a new formulation of attention via the lens of the kernel. To be more precise, we realize that attention can be seen as applying a kernel smoother over the inputs, with the kernel scores being the similarities between inputs. This new formulation gives us a better way to understand individual components of the Transformer's attention, such as a better way to integrate the positional embedding. Another important advantage of our kernel-based formulation is that it paves the way to a larger space for composing the Transformer's attention. As an example, we propose a new variant of the Transformer's attention which models the input as a product of symmetric kernels. This approach achieves performance competitive with the current state-of-the-art model with less computation. In our experiments, we empirically study different kernel construction strategies on two widely used tasks: neural machine translation and sequence prediction.
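The kernel-smoother reading of attention can be sketched in a few lines. Each output is a weighted average of the values, with weights given by normalized kernel scores between a query and every key. The sketch assumes the kernel is the exponentiated scaled dot product, which reproduces standard softmax attention. It is an illustration of the general idea only: it omits positional embeddings and does not implement the paper's proposed variant, and all names and toy vectors are made up.

```python
import math


def kernel(q, k):
    """Exponentiated scaled dot product: the similarity implied by softmax attention."""
    d = len(q)
    return math.exp(sum(a * b for a, b in zip(q, k)) / math.sqrt(d))


def kernel_smoother_attention(queries, keys, values):
    """For each query, return the kernel-weighted average of the value vectors."""
    dim = len(values[0])
    outputs = []
    for q in queries:
        scores = [kernel(q, k) for k in keys]
        total = sum(scores)
        weights = [s / total for s in scores]  # normalized scores sum to 1
        outputs.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]
        )
    return outputs


# Toy inputs: two queries attending over three key/value pairs.
queries = [[1.0, 0.0], [0.0, 1.0]]
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
out = kernel_smoother_attention(queries, keys, values)
print(out)
```

Swapping in a different `kernel` function changes the attention variant without touching the averaging step, which is the sense in which the kernel view opens up a larger design space.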