"Machine translation (MT) is the application of computers to the task of translating texts from one natural language to another. One of the very earliest pursuits in computer science, MT has proved to be an elusive goal, but today a number of systems are available which produce output which, if not perfect, is of sufficient quality to be useful in a number of specific domains."
– Definition from the European Association for Machine Translation (EAMT).
The failures of artificial intelligence systems have become a recurring theme in technology news: recommendation systems that promote violent content, trending algorithms that amplify fake news. Most complex software systems fail at some point and need to be updated regularly. We have procedures and tools that help us find and fix these errors.
Continual learning (CL) aims to enable information systems to learn from a continuous data stream across time. However, it is difficult for existing deep learning architectures to learn a new task without largely forgetting previously acquired knowledge. Furthermore, CL is particularly challenging for language learning, as natural language is ambiguous: it is discrete, compositional, and its meaning is context-dependent. In this work, we look at the problem of CL through the lens of various NLP tasks. Our survey discusses major challenges in CL and current methods applied in neural network models. We also provide a critical review of the existing CL evaluation methods and datasets in NLP.
This article investigates multilingual evidence retrieval and claim verification as a step to combat global disinformation, which is, to the best of our knowledge, the first effort of this kind. A 400-example mixed-language English-Romanian dataset is created for cross-lingual transfer learning evaluation. We make code, datasets, and trained models available upon publication.
Amazon has introduced the new Live Translation feature to Alexa, enabling real-time translations between certain languages in both voice and text form. The feature uses the same AI models as Alexa's bilingual understanding to recognize which language of a supported pair is being spoken and to translate it into the other. Right now, translations are limited to English paired with French, Spanish, Hindi, German, Italian, or Brazilian Portuguese. Live Translation is available on any Echo device by asking Alexa in English to translate German or French, or any of the other languages. When the voice assistant beeps, the user can speak either language naturally and Alexa will then repeat what was said in the other language.
Ensemble approaches are commonly used techniques for improving a system by combining multiple model predictions. Additionally, these schemes allow the uncertainty, as well as the source of the uncertainty, to be derived for the prediction. Unfortunately, these benefits come at a computational and memory cost. To address this problem, ensemble distillation (EnD) and, more recently, ensemble distribution distillation (EnDD) have been proposed; these compress the ensemble into a single model that represents either the ensemble average prediction or the ensemble prediction distribution, respectively. This paper examines the application of both distillation approaches to a sequence prediction task, grammatical error correction (GEC). This is an important application area for language learning tasks, as it can yield highly useful feedback to the learner. It is, however, more challenging than the standard tasks investigated for distillation, as the prediction of any grammatical correction to a word will be highly dependent on both the input sequence and the generated output history for the word. The performance of both EnD and EnDD is evaluated on publicly available GEC tasks as well as a spoken language task.
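As a rough illustration of the two ideas in the abstract above (deriving uncertainty and its source from an ensemble, and distilling the ensemble average into one model), the following minimal sketch handles a single prediction step. It is not the paper's implementation: the function names are hypothetical, the uncertainty split is the commonly used entropy-based decomposition (assumed here, since the abstract does not specify one), and the distillation loss shown is a plain KL divergence to the ensemble average, which corresponds to the EnD idea only.

```python
import math


def mean_distribution(member_probs):
    """Average the class probabilities predicted by each ensemble member."""
    n_members = len(member_probs)
    n_classes = len(member_probs[0])
    return [sum(p[k] for p in member_probs) / n_members for k in range(n_classes)]


def entropy(probs):
    """Shannon entropy in nats; zero-probability classes contribute nothing."""
    return -sum(x * math.log(x) for x in probs if x > 0.0)


def uncertainty_decomposition(member_probs):
    """Split total uncertainty into data and knowledge (model) parts.

    Assumed decomposition: total = entropy of the averaged prediction,
    data = average entropy of the members, knowledge = total - data.
    """
    total = entropy(mean_distribution(member_probs))
    data = sum(entropy(p) for p in member_probs) / len(member_probs)
    return {"total": total, "data": data, "knowledge": total - data}


def end_loss(member_probs, student_probs):
    """KL(ensemble average || student): a hypothetical EnD-style training loss."""
    avg = mean_distribution(member_probs)
    return sum(a * math.log(a / s) for a, s in zip(avg, student_probs) if a > 0.0)


# Hypothetical example: two members that confidently disagree with each other.
disagreeing = [[1.0, 0.0], [0.0, 1.0]]
print(uncertainty_decomposition(disagreeing))
```

When the members agree, the knowledge term vanishes and all uncertainty is attributed to the data; when they confidently disagree, as in the example, the uncertainty is attributed to the models. For a sequence task such as GEC, a computation like this would have to be repeated at every output position, conditioned on the output history.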
Khashabi, Daniel, Cohan, Arman, Shakeri, Siamak, Hosseini, Pedram, Pezeshkpour, Pouya, Alikhani, Malihe, Aminnaseri, Moin, Bitaab, Marzieh, Brahman, Faeze, Ghazarian, Sarik, Gheini, Mozhdeh, Kabiri, Arman, Mahabadi, Rabeeh Karimi, Memarrast, Omid, Mosallanezhad, Ahmadreza, Noury, Erfan, Raji, Shahab, Rasooli, Mohammad Sadegh, Sadeghi, Sepideh, Azer, Erfan Sadeqi, Samghabadi, Niloofar Safi, Shafaei, Mahsa, Sheybani, Saber, Tazarv, Ali, Yaghoobzadeh, Yadollah
Despite the progress made in recent years in addressing natural language understanding (NLU) challenges, the majority of this progress remains concentrated on resource-rich languages like English. This work focuses on the Persian language, one of the widely spoken languages in the world, for which few NLU datasets are nevertheless available. The availability of high-quality evaluation datasets is a necessity for reliable assessment of the progress on different NLU tasks and domains. We introduce ParsiNLU, the first benchmark in the Persian language that includes a range of high-level tasks -- Reading Comprehension, Textual Entailment, etc. These datasets are collected in a multitude of ways, often involving manual annotations by native speakers. This results in over 14.5k new instances across 6 distinct NLU tasks. In addition, we present the first results of state-of-the-art monolingual and multilingual pre-trained language models on this benchmark and compare them with human performance, which provides valuable insights into our ability to tackle natural language understanding challenges in Persian. We hope ParsiNLU fosters further research and advances in Persian language understanding.
As part of a larger project on optimal learning conditions in neural machine translation, we investigate characteristic training phases of translation engines. All our experiments are carried out using OpenNMT-Py: the pre-processing step is implemented using the Europarl training corpus and the INTERSECT corpus is used for validation. Longitudinal analyses of training phases suggest that the progression of translations is not always linear. Following the results of textometric explorations, we identify the importance of the phenomena related to chronological progression, in order to map different processes at work in neural machine translation (NMT).
This article contains a proposal to add coinduction to the computational apparatus of natural language understanding. This, we argue, will provide a basis for more realistic, computationally sound, and scalable models of natural language dialogue, syntax and semantics. Given that the bottom-up, inductively constructed, semantic and syntactic structures are brittle, and seemingly incapable of adequately representing the meaning of longer sentences or realistic dialogues, natural language understanding is in need of a new foundation. Coinduction, which uses top-down constraints, has been successfully used in the design of operating systems and programming languages. Moreover, it has been implicitly present in text mining, machine translation, and in some attempts to model intensionality and modalities, which provides evidence that it works. This article shows high-level formalizations of some such uses. Since coinduction and induction can coexist, they can provide a common language and a conceptual model for research in natural language understanding. In particular, such an opportunity seems to be emerging in research on compositionality. This article shows several examples of the joint appearance of induction and coinduction in natural language processing. We argue that the known individual limitations of induction and coinduction can be overcome in empirical settings by a combination of the two methods. We see an open problem in providing a theory of their joint use.
Memes are used for spreading ideas through social networks. Although most memes are created for humor, some memes become hateful through the combination of pictures and text. Automatically detecting hateful memes can help reduce their harmful social impact. Unlike conventional multimodal tasks, where the visual and textual information is semantically aligned, the challenge of hateful memes detection lies in its unique multimodal information. The image and text in memes are weakly aligned or even irrelevant to each other, which requires the model to understand the content and perform reasoning over multiple modalities. In this paper, we focus on multimodal hateful memes detection and propose a novel method that incorporates the image captioning process into the memes detection process. We conduct extensive experiments on multimodal meme datasets and illustrate the effectiveness of our approach. Our model achieves promising results on the Hateful Memes Detection Challenge.
Despite its recent success in image classification, self-training has achieved only limited gains on structured prediction tasks such as neural machine translation (NMT). This is mainly due to the compositionality of the target space, where the far-away prediction hypotheses lead to the notorious reinforced mistake problem. In this paper, we revisit the utilization of multiple diverse models and present a simple yet effective approach named Reciprocal-Supervised Learning (RSL). RSL first exploits individual models to generate pseudo parallel data, and then cooperatively trains each model on the combined synthetic corpus. RSL leverages the fact that differently parameterized models have different inductive biases, and that better predictions can be made by jointly exploiting the agreement among them. Unlike previous knowledge distillation methods built upon a much stronger teacher, RSL is capable of boosting the accuracy of one model by introducing other comparable or even weaker models. RSL can also be viewed as a more efficient alternative to ensembling. Extensive experiments demonstrate the superior performance of RSL on several benchmarks by significant margins.