Machine Translation
Machine-Translated Knowledge Transfer for Commonsense Causal Reasoning
Yeo, Jinyoung (Pohang University of Science and Technology) | Wang, Geungyu (Yonsei University) | Cho, Hyunsouk (Pohang University of Science and Technology) | Choi, Seungtaek (Yonsei University) | Hwang, Seung-won (Yonsei University)
This paper studies the problem of multilingual causal reasoning in resource-poor languages. Existing approaches, translating into the most probable resource-rich language such as English, suffer in the presence of translation and language gaps between different cultural areas, which leads to the loss of causality. To overcome these challenges, our goal is thus to identify key techniques to construct a new causality network of cause-effect terms, targeted for the machine-translated English, but without any language-specific knowledge of resource-poor languages. In our evaluations with three languages, Korean, Chinese, and French, our proposed method consistently outperforms all baselines, achieving up to 69.0% reasoning accuracy, which is close to the state-of-the-art accuracy of 70.2% achieved on English.
RUBER: An Unsupervised Method for Automatic Evaluation of Open-Domain Dialog Systems
Tao, Chongyang (Peking University) | Mou, Lili (University of Waterloo) | Zhao, Dongyan (Peking University) | Yan, Rui (Peking University)
Open-domain human-computer conversation has been attracting increasing attention over the past few years. However, there does not exist a standard automatic evaluation metric for open-domain dialog systems; researchers usually resort to human annotation for model evaluation, which is time- and labor-intensive. In this paper, we propose RUBER, a Referenced metric and Unreferenced metric Blended Evaluation Routine, which evaluates a reply by taking into consideration both a groundtruth reply and a query (previous user-issued utterance). Our metric is learnable, but its training does not require labels of human satisfaction. Hence, RUBER is flexible and extensible to different datasets and languages. Experiments on both retrieval and generative dialog systems show that RUBER has a high correlation with human annotation, and that RUBER has fair transferability over different datasets.
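As a rough illustration of blending a referenced and an unreferenced score, the sketch below assumes that the referenced score is a cosine similarity between reply and groundtruth embeddings and that the unreferenced score is already available as a number in [0, 1] from some learned query-reply scorer. The function names, the rescaling, and the blending options (`min`, `max`, `mean`) are assumptions made for illustration; the abstract does not specify them.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def referenced_score(reply_vec, groundtruth_vec):
    """Similarity of a reply to the groundtruth reply, rescaled from [-1, 1] to [0, 1]."""
    return (cosine(reply_vec, groundtruth_vec) + 1.0) / 2.0


def blend(referenced, unreferenced, how="min"):
    """Combine the two scores with a simple heuristic (assumed, not from the abstract)."""
    if how == "min":
        return min(referenced, unreferenced)
    if how == "max":
        return max(referenced, unreferenced)
    if how == "mean":
        return (referenced + unreferenced) / 2.0
    raise ValueError(f"unknown blending strategy: {how}")


# Hypothetical embeddings and a hypothetical unreferenced score of 0.4.
ref = referenced_score([1.0, 0.0], [1.0, 0.0])
final = blend(ref, 0.4, how="min")
```

With `how="min"`, a reply scores well only if it is both close to the groundtruth and judged appropriate for the query, which is one plausible way to make the blended metric conservative.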
Attention-via-Attention Neural Machine Translation
Zhao, Shenjian (Shanghai Jiao Tong University) | Zhang, Zhihua (Peking University)
Since many languages originated from a common ancestral language and influence each other, there would inevitably exist similarities between these languages such as lexical similarity and named entity similarity. In this paper, we leverage these similarities to improve the translation performance in neural machine translation. Specifically, we introduce an attention-via-attention mechanism that allows the information of source-side characters flowing to the target side directly. With this mechanism, the target-side characters will be generated based on the representation of source-side characters when the words are similar. For instance, our proposed neural machine translation system learns to transfer the character-level information of the English word "system" through the attention-via-attention mechanism to generate the Czech word "systém." Consequently, our approach is able to not only achieve a competitive translation performance, but also reduce the model size significantly.
Joint Training for Neural Machine Translation Models with Monolingual Data
Zhang, Zhirui (University of Science and Technology of China) | Liu, Shujie (Microsoft Research) | Li, Mu (Microsoft Research) | Zhou, Ming (Microsoft Research) | Chen, Enhong (University of Science and Technology of China)
Monolingual data have been demonstrated to be helpful in improving translation quality of both statistical machine translation (SMT) systems and neural machine translation (NMT) systems, especially in resource-poor or domain adaptation tasks where parallel data are not rich enough. In this paper, we propose a novel approach to better leveraging monolingual data for neural machine translation by jointly learning source-to-target and target-to-source NMT models for a language pair with a joint EM optimization method. The training process starts with two initial NMT models pre-trained on parallel data for each direction, and these two models are iteratively updated by incrementally decreasing translation losses on training data. In each iteration step, both NMT models are first used to translate monolingual data from one language to the other, forming pseudo-training data for the other NMT model. Then two new NMT models are learnt from parallel data together with the pseudo-training data. Both NMT models are expected to be improved, and better pseudo-training data can be generated in the next step. Experimental results on Chinese-English and English-German translation tasks show that our approach can simultaneously improve translation quality of source-to-target and target-to-source models, significantly outperforming strong baseline systems which are enhanced with monolingual data for model training, including back-translation.
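The data flow of the iteration described above can be sketched with a toy stand-in for the NMT models. Everything here is hypothetical: `train` merely memorizes a word-to-word table from equal-length sentence pairs, `translate` applies that table, and the EM-style weighting of pseudo pairs mentioned in the abstract is not modeled. The sketch only shows which model produces pseudo-training data for which.

```python
def train(pairs):
    """Toy stand-in for NMT training: memorize a word-level mapping."""
    table = {}
    for src, tgt in pairs:
        src_words, tgt_words = src.split(), tgt.split()
        if len(src_words) == len(tgt_words):
            for s, t in zip(src_words, tgt_words):
                table.setdefault(s, t)  # keep the first mapping seen
    return table


def translate(model, sentence):
    """Translate word by word; unknown words are copied unchanged."""
    return " ".join(model.get(w, w) for w in sentence.split())


def joint_training(parallel, mono_src, mono_tgt, iterations=2):
    reversed_parallel = [(t, s) for s, t in parallel]
    # Initial models, pre-trained on parallel data for each direction.
    s2t = train(parallel)
    t2s = train(reversed_parallel)
    for _ in range(iterations):
        # Each model translates monolingual data to build pseudo pairs
        # for the *other* model (real text is always on the output side).
        pseudo_for_s2t = [(translate(t2s, t), t) for t in mono_tgt]
        pseudo_for_t2s = [(translate(s2t, s), s) for s in mono_src]
        # Retrain both models on parallel plus pseudo-training data.
        s2t = train(parallel + pseudo_for_s2t)
        t2s = train(reversed_parallel + pseudo_for_t2s)
    return s2t, t2s


s2t, t2s = joint_training([("a b", "x y")], mono_src=["b a"], mono_tgt=["y x"])
```

In a real system the two `train` calls would be gradient updates of neural models, and the quality of the pseudo pairs would improve from one iteration to the next as both models improve.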
Exploring Implicit Feedback for Open Domain Conversation Generation
Zhang, Wei-Nan (Harbin Institute of Technology) | Li, Lingzhi (Harbin Institute of Technology) | Cao, Dongyan (Harbin Institute of Technology) | Liu, Ting (Harbin Institute of Technology)
User feedback can be an effective indicator of the success of a human-robot conversation. However, to avoid interrupting the online real-time conversation process, explicit feedback is usually gained at the end of a conversation. Alternatively, users' responses usually contain their implicit feedback, such as stance, sentiment, emotion, etc., towards the conversation content or the interlocutors. Therefore, exploring the implicit feedback is a natural way to optimize the conversation generation process. In this paper, we propose a novel reward function which explores the implicit feedback to optimize the future reward of a reinforcement learning based neural conversation model. A simulation strategy is applied to explore the state-action space in training and test. Experimental results show that the proposed approach outperforms the Seq2Seq model and the state-of-the-art reinforcement learning model for conversation generation on automatic and human evaluations on the OpenSubtitles and Twitter datasets.
Improved English to Russian Translation by Neural Suffix Prediction
Song, Kai (Soochow University, Alibaba Group) | Zhang, Yue (Singapore University of Technology and Design) | Zhang, Min (Soochow University) | Luo, Weihua (Alibaba Group)
Neural machine translation (NMT) suffers a performance deficiency when a limited vocabulary fails to cover the source or target side adequately, which happens frequently when dealing with morphologically rich languages. To address this problem, previous work focused on adjusting translation granularity or expanding the vocabulary size. However, morphological information, which may further improve translation quality, is relatively under-considered in NMT architectures. We propose a novel method, which can not only reduce data sparsity but also model morphology through a simple but effective mechanism. By predicting the stem and suffix separately during decoding, our system achieves an improvement of up to 1.98 BLEU compared with previous work on English to Russian translation. Our method is orthogonal to different NMT architectures and stably gains improvements on various domains.
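To make the stem/suffix idea concrete, the sketch below splits target-side words into a stem and a suffix and merges two predicted sequences back into surface words. The suffix list, the longest-match rule, and the minimum stem length are assumptions chosen for illustration; the abstract does not say how words are segmented.

```python
# Hypothetical, deliberately tiny list of Russian noun suffixes.
SUFFIXES = ("ами", "ов", "ах", "а", "ы")


def split_word(word, suffixes=SUFFIXES, min_stem=2):
    """Split a word into (stem, suffix) using longest-suffix matching."""
    for suffix in sorted(suffixes, key=len, reverse=True):
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem:
            return word[: -len(suffix)], suffix
    return word, ""  # no known suffix: the whole word is the stem


def merge(stems, suffixes):
    """Recombine separately predicted stem and suffix sequences into words."""
    return [stem + suffix for stem, suffix in zip(stems, suffixes)]


pairs = [split_word(w) for w in ["книгами", "стол"]]
stems = [stem for stem, _ in pairs]
suffixes = [suffix for _, suffix in pairs]
words = merge(stems, suffixes)
```

Because many inflected forms share one stem, the stem vocabulary is much smaller than the vocabulary of full word forms, which is how a scheme like this can reduce data sparsity.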
Synthesis of Programs from Multimodal Datasets
Thakoor, Shantanu (Stanford University) | Shah, Simoni (Indian Institute of Technology, Bombay) | Ramakrishnan, Ganesh (Indian Institute of Technology, Bombay) | Sanyal, Amitabha (Indian Institute of Technology, Bombay)
We describe MultiSynth, a framework for synthesizing domain-specific programs from a multimodal dataset of examples. Given a domain-specific language (DSL), a dataset is multimodal if there is no single program in the DSL that generalizes over all the examples. Further, even if the examples in the dataset were generalized in terms of a set of programs, the domains of these programs may not be disjoint, thereby leading to ambiguity in synthesis. MultiSynth is a framework that incorporates concepts of synthesizing programs with minimum generality, while addressing the need for accurate prediction. We show how these can be achieved through (i) transformation-driven partitioning of the dataset, (ii) least general generalization, for a generalized specification of the input and the output, and (iii) learning to rank, for estimating feature weights in order to map an input to the most appropriate mode in case of ambiguity. We show the effectiveness of our framework in two domains: in the first case, we extend an existing approach for synthesizing programs for XML tree transformations to ambiguous multimodal datasets. In the second case, MultiSynth is used to preorder words for machine translation, by learning permutations of productions in the parse trees of the source side sentences. Our evaluations reflect the effectiveness of our approach.
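Least general generalization (anti-unification) can be illustrated on flat tuples of symbols: positions where two examples agree are kept, and positions where they differ are replaced by variables, with the same pair of differing symbols always mapped to the same variable. This is a minimal sketch of the general technique, not MultiSynth's implementation; the tuple representation and the `?X` variable naming are assumptions.

```python
def lgg(a, b):
    """Least general generalization of two equal-length tuples of symbols."""
    if len(a) != len(b):
        raise ValueError("this sketch only handles tuples of equal length")
    variables = {}  # maps a pair of differing symbols to its variable name
    result = []
    for x, y in zip(a, b):
        if x == y:
            result.append(x)  # shared symbol is kept as-is
        else:
            # Reuse the variable if this exact pair was seen before.
            name = variables.setdefault((x, y), f"?X{len(variables)}")
            result.append(name)
    return tuple(result)


same_pair = lgg(("f", "a", "a"), ("f", "b", "b"))
different_pairs = lgg(("f", "a", "c"), ("f", "b", "d"))
```

Reusing one variable for a repeated pair is what keeps the result least general: `("f", "?X0", "?X0")` still records that the second and third arguments are equal, whereas `("f", "?X0", "?X1")` would discard that constraint.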
Can Artificial Intelligence solve the translation challenge in Learning?
Providing learning content in a learner's native language has always been a major challenge for knowledge transfer in global environments. With all technology advancements, the process has remained highly manual – slow, cumbersome, and expensive. Once content is available in a source language, translators are hired – typically through external agencies – who then manually translate into the required language. Then, to ensure your business-specific lingo and context were translated correctly, another intensive quality assurance step is done with local experts – which often takes longer than the translation itself, due to resource bottlenecks. Multiply this by lots of content and lots of languages – and add, as a further ingredient, that the original source content may change while translation projects are already underway – and you soon get to unsolvable scalability and funding challenges.
[D] Douglas Hofstadter: The Shallowness of Google Translate • r/MachineLearning
He pulls [a notebook] down--it's from the late 1950s. Ever since he was a teenager, he has captured some 10,000 examples of swapped syllables ("hypodeemic nerdle"), malapropisms ("runs the gambit"), "malaphors" ("easy-go-lucky"), and so on, about half of them committed by Hofstadter himself. He makes photocopies of his notebook pages, cuts them up with scissors, and stores the errors in filing cabinets and labeled boxes around his study.
The Shallowness of Google Translate
As a language lover and an impassioned translator, as a cognitive scientist and a lifelong admirer of the human mind's subtlety, I have followed the attempts to mechanize translation for decades. When I look at an article in Russian, I say, "This is really written in English, but it has been coded in some strange symbols. I will now proceed to decode." Some years later he offered a different viewpoint: "No reasonable person thinks that a machine translation can ever achieve elegance and style." Having devoted one unforgettably intense year of my life to translating Alexander Pushkin's sparkling novel in verse Eugene Onegin into my native tongue (that is, having radically reworked that great Russian work into an English-language novel in verse), I find this remark of Weaver's far more congenial than his earlier remark, which reveals a strangely simplistic view of language.