Goto

Collaborating Authors

 Machine Translation


Best Machine language Translators

#artificialintelligence

Machine language translators have improved a lot over the years. They have become earlier to use and produce accurate translations at cheaper to no cost. For localization translation machine translation services and software have served as a boon. The neural machine translation algorithm makes the delivery of translations natural. Let's take a look at the best machine translation engines in 2022


New voices in AI: David Adelani

AIHub

Welcome to the first episode of New voices in AI! You can find David on Twitter @davlanade and find out more about Masakhane here. The music used is'Wholesome' by Kevin MacLeod, Licensed under Creative Commons Daly: Hello and welcome to new voices in AI, this a new series from AIhub where we celebrate the voices PhD students, early career researchers, and those with a new perspective on AI. And without further ado, let's begin. First up, a big welcome to our very first guest on "New voices in AI" and if you could introduce yourself, who are you? Adelani: Thank you very much for having me. So, Masakhane is this grassroots organization, whose mission is to strengthen and spur NLP research in African languages, by Africans for Africans, so, and currently the organization we are majorly operating on Slack we already have over 1000 Members. Of course, not everyone is active but we have more than 100 or close to 100 active members as well, yeah. So how did, how did you get into AI?


"Artificial Intelligence" Science-Research, January 2022, Week 3 -- summary from Europe PMC

#artificialintelligence

Background Liver is one of the most typical metastatic sites of colon cancer cells and liver metastasis determines subsequent therapy along with prognosis of patients, particularly in T1 patients. There is still no effective model to predict the danger of LM in T1 CRC patients. Objectives Chest radiographs are commonly performed in emergency units, yet the interpretation calls for radiology experience. Presently, top quality English-Chinese parallel corpus is presently in a phase of shortage. After that, the multilingual dictionary summed up by the translation model is combined with the language model, unsupervised translation model is initialized, unsupervised English-Chinese neural machine translation model is optimized with the back translation technique.


An Empirical Study on the Overlapping Problem of Open-Domain Dialogue Datasets

arXiv.org Artificial Intelligence

Open-domain dialogue systems aim to converse with humans through text, and its research has heavily relied on benchmark datasets. In this work, we first identify the overlapping problem in DailyDialog and OpenSubtitles, two popular open-domain dialogue benchmark datasets. Our systematic analysis then shows that such overlapping can be exploited to obtain fake state-of-the-art performance. Finally, we address this issue by cleaning these datasets and setting up a proper data processing procedure for future research.


Cost-Effective Training in Low-Resource Neural Machine Translation

arXiv.org Artificial Intelligence

While Active Learning (AL) techniques are explored in Neural Machine Translation (NMT), only a few works focus on tackling low annotation budgets where a limited number of sentences can get translated. Such situations are especially challenging and can occur for endangered languages with few human annotators or having cost constraints to label large amounts of data. Although AL is shown to be helpful with large budgets, it is not enough to build high-quality translation systems in these low-resource conditions. In this work, we propose a cost-effective training procedure to increase the performance of NMT models utilizing a small number of annotated sentences and dictionary entries. Our method leverages monolingual data with self-supervised objectives and a small-scale, inexpensive dictionary for additional supervision to initialize the NMT model before applying AL. We show that improving the model using a combination of these knowledge sources is essential to exploit AL strategies and increase gains in low-resource conditions. We also present a novel AL strategy inspired by domain adaptation for NMT and show that it is effective for low budgets. We propose a new hybrid data-driven approach, which samples sentences that are diverse from the labelled data and also most similar to unlabelled data. Finally, we show that initializing the NMT model and further using our AL strategy can achieve gains of up to $13$ BLEU compared to conventional AL methods.


Natural Language Processing- How different NLP Algorithms work

#artificialintelligence

Natural Language Processing (NLP) is an area in computer science that studies the interactions between computers and human languages. It is the technology behind search engines such as Google. The analysis of language can be done manually, and it has been done for centuries. But technology continues to evolve, which is especially true in natural language processing (NLP). The Machine and Deep Learning communities have been actively pursuing Natural Language Processing (NLP) through various techniques.


SMDT: Selective Memory-Augmented Neural Document Translation

arXiv.org Artificial Intelligence

Existing document-level neural machine translation (NMT) models have sufficiently explored different context settings to provide guidance for target generation. However, little attention is paid to inaugurate more diverse context for abundant context information. In this paper, we propose a Selective Memory-augmented Neural Document Translation model to deal with documents containing large hypothesis space of the context. Specifically, we retrieve similar bilingual sentence pairs from the training corpus to augment global context and then extend the two-stream attention model with selective mechanism to capture local context and diverse global contexts. This unified approach allows our model to be trained elegantly on three publicly document-level machine translation datasets and significantly outperforms previous document-level NMT models.


Pixel Recursive Super Resolution. Paper @Google Brain. Ryan Dahl, Mohammad Norouzi & Jonathon Shlens

#artificialintelligence

Research ... hoy traemos a este espacio otro paper de Google ... aquí os dejamos el Abstract We present a pixel recursive super resolution model that synthesizes realistic details into images while enhancing their resolution. A low resolution image may correspond to multiple plausible high resolution images, thus modeling the super resolution process with a pixel independent conditional model often results in averaging different details–hence blurry edges. By contrast, our model is able to represent a multimodal conditional distribution by properly modeling the statistical dependencies among the high resolution image pixels, conditioned on a low resolution input. We employ a PixelCNN architecture to define a strong prior over natural images and jointly optimize this prior with a deep conditioning convolutional network. Human evaluations indicate that samples from our proposed model look.(leer


Diformer: Directional Transformer for Neural Machine Translation

arXiv.org Artificial Intelligence

Autoregressive (AR) and Non-autoregressive (NAR) models have their own superiority on the performance and latency, combining them into one model may take advantage of both. Current combination frameworks focus more on the integration of multiple decoding paradigms with a unified generative model, e.g. Masked Language Model. However, the generalization can be harmful to the performance due to the gap between training objective and inference. In this paper, we aim to close the gap by preserving the original objective of AR and NAR under a unified framework. Specifically, we propose the Directional Transformer (Diformer) by jointly modelling AR and NAR into three generation directions (left-to-right, right-to-left and straight) with a newly introduced direction variable, which works by controlling the prediction of each token to have specific dependencies under that direction. The unification achieved by direction successfully preserves the original dependency assumption used in AR and NAR, retaining both generalization and performance. Experiments on 4 WMT benchmarks demonstrate that Diformer outperforms current united-modelling works with more than 1.5 BLEU points for both AR and NAR decoding, and is also competitive to the state-of-the-art independent AR and NAR models.


A Preordered RNN Layer Boosts Neural Machine Translation in Low Resource Settings

arXiv.org Artificial Intelligence

Neural Machine Translation (NMT) models are strong enough to convey semantic and syntactic information from the source language to the target language. However, these models are suffering from the need for a large amount of data to learn the parameters. As a result, for languages with scarce data, these models are at risk of underperforming. We propose to augment attention based neural network with reordering information to alleviate the lack of data. This augmentation improves the translation quality for both English to Persian and Persian to English by up to 6% BLEU absolute over the baseline models.