Machine Translation


Simple Yet Effective Neural Ranking and Reranking Baselines for Cross-Lingual Information Retrieval

arXiv.org Artificial Intelligence

The advent of multilingual language models has generated a resurgence of interest in cross-lingual information retrieval (CLIR), which is the task of searching documents in one language with queries from another. However, the rapid pace of progress has led to a confusing panoply of methods and reproducibility has lagged behind the state of the art. In this context, our work makes two important contributions: First, we provide a conceptual framework for organizing different approaches to cross-lingual retrieval using multi-stage architectures for mono-lingual retrieval as a scaffold. Second, we implement simple yet effective reproducible baselines in the Anserini and Pyserini IR toolkits for test collections from the TREC 2022 NeuCLIR Track, in Persian, Russian, and Chinese. Our efforts are built on a collaboration of the two teams that submitted the most effective runs to the TREC evaluation. These contributions provide a firm foundation for future advances.


Semi-supervised Neural Machine Translation with Consistency Regularization for Low-Resource Languages

arXiv.org Artificial Intelligence

The advent of deep learning has led to significant gains in machine translation. However, most studies require a large parallel dataset, which is scarce, expensive to construct, and for some languages unavailable altogether. This paper presents a simple yet effective method to tackle this problem for low-resource languages by augmenting high-quality sentence pairs and training NMT models in a semi-supervised manner. Specifically, our approach combines the cross-entropy loss for supervised learning with a KL divergence loss for unsupervised learning on pseudo and augmented target sentences derived from the model. We also introduce a SentenceBERT-based filter that improves the quality of the augmented data by retaining only semantically similar sentence pairs. Experimental results show that our approach significantly improves over NMT baselines, especially on low-resource datasets, with gains of 0.46 to 2.03 BLEU. We also demonstrate that unsupervised training on augmented data is more efficient than reusing the ground-truth target sentences for supervised learning.
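
The combination of a supervised cross-entropy term with a KL divergence consistency term can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names, the weighting factor `lam`, the direction of the KL term, and the toy per-token probability lists are all assumptions made for the example.

```python
import math

def cross_entropy(probs, target_index):
    """Negative log-likelihood of the reference token under the model."""
    return -math.log(probs[target_index])

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same vocabulary."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def semi_supervised_loss(probs_original, probs_augmented, target_index, lam=1.0):
    # Supervised term: the prediction on the original pair should match the reference.
    supervised = cross_entropy(probs_original, target_index)
    # Consistency term (assumed form): predictions on the original and the
    # augmented input should agree, which needs no reference translation.
    consistency = kl_divergence(probs_original, probs_augmented)
    return supervised + lam * consistency
```

When the two predicted distributions agree, the consistency term vanishes and only the supervised loss remains; `lam` (hypothetical) controls how strongly disagreement is penalized.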


Ham2Pose: Animating Sign Language Notation into Pose Sequences

arXiv.org Artificial Intelligence

Translating spoken languages into Sign languages is necessary for open communication between the hearing and hearing-impaired communities. To achieve this goal, we propose the first method for animating a text written in HamNoSys, a lexical Sign language notation, into signed pose sequences. As HamNoSys is universal by design, our proposed method offers a generic solution invariant to the target Sign language. Our method gradually generates pose predictions using transformer encoders that create meaningful representations of the text and poses while considering their spatial and temporal information. We use weak supervision for the training process and show that our method succeeds in learning from partial and inaccurate data. Additionally, we offer a new distance measurement that considers missing keypoints, to measure the distance between pose sequences using DTW-MJE. We validate its correctness using AUTSL, a large-scale Sign language dataset, show that it measures the distance between pose sequences more accurately than existing measurements, and use it to assess the quality of our generated pose sequences. Code for the data pre-processing, the model, and the distance measurement is publicly released for future research.
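
As a rough illustration of a DTW-based distance over pose sequences that tolerates missing keypoints, consider the sketch below. It is not the paper's released code: representing a missing keypoint as `None`, skipping keypoints missing from either pose, and assigning zero cost to frame pairs with no shared keypoints are all assumptions made for this example, and the paper's own handling of missing keypoints may differ.

```python
import math

def mean_joint_error(pose_a, pose_b):
    """Mean Euclidean distance over keypoints present in both poses."""
    dists = [
        math.dist(a, b)
        for a, b in zip(pose_a, pose_b)
        if a is not None and b is not None  # assumption: skip missing keypoints
    ]
    # Assumption: frames with no shared keypoints contribute zero cost.
    return sum(dists) / len(dists) if dists else 0.0

def dtw_mje(seq_a, seq_b):
    """Dynamic time warping with mean joint error as the frame-level cost."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = mean_joint_error(seq_a[i - 1], seq_b[j - 1])
            # Extend the cheapest of the three admissible warping steps.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]
```

Because DTW aligns frames non-linearly in time, a sequence and a slowed-down copy of it are at distance zero under this sketch.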


Exploiting Multilingualism in Low-resource Neural Machine Translation via Adversarial Learning

arXiv.org Artificial Intelligence

Generative Adversarial Networks (GANs) offer a promising approach for Neural Machine Translation (NMT). However, feeding multiple morphologically different languages into a single model during training reduces the NMT system's performance. In GANs, as in bilingual models, multilingual NMT considers only one reference translation for each sentence during model training. This single reference translation prevents the GAN model from learning sufficient information about the source sentence representation. Thus, in this article, we propose the Denoising Adversarial Auto-encoder-based Sentence Interpolation (DAASI) approach, which performs sentence interpolation by learning the intermediate latent representation of the source and target sentences of multilingual language pairs. Apart from the latent representation, we also use the Wasserstein-GAN approach for the multilingual NMT model, incorporating the model-generated sentences of multiple languages into the reward computation. This computed reward effectively optimizes the performance of the GAN-based multilingual model. We run experiments on low-resource language pairs and find that our approach outperforms the existing state-of-the-art approaches for multilingual NMT, with a performance gain of up to 4 BLEU points. Moreover, we apply our trained model to zero-shot language pairs in an unsupervised scenario and show the robustness of the proposed approach.


From Text to Meaning: How Natural Language Processing Algorithms Work

#artificialintelligence

Natural language processing (NLP) is a field of study that combines computer science and linguistics to help machines understand human language. NLP has become an integral part of modern technology, powering everything from chatbots to voice assistants. But how exactly do NLP algorithms work? And why do they matter? At its core, NLP is about teaching machines to understand human language.


A Study of Autoregressive Decoders for Multi-Tasking in Computer Vision

arXiv.org Artificial Intelligence

There has been a recent explosion of computer vision models which perform many tasks and are composed of an image encoder (usually a ViT) and an autoregressive decoder (usually a Transformer). However, most of this work simply presents one system and its results, leaving many questions regarding design decisions and trade-offs of such systems unanswered. In this work, we aim to provide such answers. We take a close look at autoregressive decoders for multi-task learning in multimodal computer vision, including classification, captioning, visual question answering, and optical character recognition. Through extensive systematic experiments, we study the effects of task and data mixture, training and regularization hyperparameters, conditioning type and specificity, modality combination, and more. Importantly, we compare these to well-tuned single-task baselines to highlight the cost incurred by multi-tasking. A key finding is that a small decoder learned on top of a frozen pretrained encoder works surprisingly well. We call this setup locked-image tuning with decoder (LiT-decoder). It can be seen as teaching a decoder to interact with a pretrained vision model via natural language.


XL8 Integrates Zixi, Enhancing Global Reach of Content

#artificialintelligence

Zixi, the industry leader for enabling cost-efficient and highly scalable live broadcast-quality video over any IP network or protocol and provider of the award-winning SDVP, announced a partnership with XL8 that has integrated Zixi into their innovative LiveSubs translation engine to create real-time subtitles powered by its proprietary state-of-the-art AI technology. LiveSubs allows customers to take their Zixi stream and generate live subtitled languages on the fly, from the source language into over 70 global language pairs. Media companies are under increasing pressure to meet the worldwide demand for hyper-localized translated media in the live distribution space and XL8 takes AI-powered machine translation, specially optimized for media content, to the next level. Its advanced technology allows significantly more efficient workflows by providing in-line editing, automated media transcription with time coding, automated subtitling, synthesized voice dubbing, real-time meeting interpretation including a soon-to-be-released Zoom app, and live subtitling. XL8's uniquely specialized translation engines have been built from the ground up utilizing professionally trained, human-perfected subtitles curated from the media industry's top content producers.


Machine Translation with Attention in TensorFlow Python from Scratch

#artificialintelligence

Sequence-to-sequence (Seq2Seq) models have been used extensively in Natural Language Processing (NLP) tasks such as machine translation, text summarization, and question answering. In this blog post, we will implement a Seq2Seq model for Italian-to-English machine translation using TensorFlow and object-oriented Python. The model architecture will consist of an Encoder, a Decoder, and an Attention mechanism. The first step in any machine learning task is to preprocess the data. We will be using a dataset of Italian-English sentence pairs for our translation task.
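
To make the role of the Attention mechanism concrete, here is a small framework-free sketch of a single attention step. It is not the blog post's TensorFlow code: dot-product scoring is an assumption (the post may use a different scoring function, such as additive attention), and the names `softmax` and `attend` are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(decoder_state, encoder_states):
    """Return attention weights and the context vector for one decoder step."""
    # Score each encoder state against the current decoder state (dot product).
    scores = [
        sum(d * h for d, h in zip(decoder_state, state))
        for state in encoder_states
    ]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    # The context vector is the attention-weighted sum of the encoder states.
    context = [
        sum(w * state[i] for w, state in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return weights, context
```

At each decoding step, the decoder would combine the context vector with its own state to predict the next target word, so source positions most similar to the current decoder state contribute the most.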


Language-Family Adapters for Low-Resource Multilingual Neural Machine Translation

arXiv.org Artificial Intelligence

Large multilingual models trained with self-supervision achieve state-of-the-art results in a wide range of natural language processing tasks. Self-supervised pretrained models are often fine-tuned on parallel data from one or multiple language pairs for machine translation. Multilingual fine-tuning improves performance on low-resource languages but requires modifying the entire model and can be prohibitively expensive. Training a new adapter on each language pair or training a single adapter on all language pairs without updating the pretrained model has been proposed as a parameter-efficient alternative. However, the former does not permit any sharing between languages, while the latter shares parameters for all languages and is susceptible to negative interference. In this paper, we propose training language-family adapters on top of mBART-50 to facilitate cross-lingual transfer. Our approach outperforms related baselines, yielding higher translation scores on average when translating from English to 17 different low-resource languages. We also show that language-family adapters provide an effective method to translate to languages unseen during pretraining.


Hallucinations in Large Multilingual Translation Models

arXiv.org Artificial Intelligence

Large-scale multilingual machine translation systems have demonstrated remarkable ability to translate directly between numerous languages, making them increasingly appealing for real-world applications. However, when deployed in the wild, these models may generate hallucinated translations which have the potential to severely undermine user trust and raise safety concerns. Existing research on hallucinations has primarily focused on small bilingual models trained on high-resource languages, leaving a gap in our understanding of hallucinations in massively multilingual models across diverse translation scenarios. In this work, we fill this gap by conducting a comprehensive analysis on both the M2M family of conventional neural machine translation models and ChatGPT, a general-purpose large language model (LLM) that can be prompted for translation. Our investigation covers a broad spectrum of conditions, spanning over 100 translation directions across various resource levels and going beyond English-centric language pairs. We provide key insights regarding the prevalence, properties, and mitigation of hallucinations, paving the way towards more responsible and reliable machine translation systems.