Machine Translation
Modeling Intensification for Sign Language Generation: A Computational Approach
İnan, Mert, Zhong, Yang, Hassan, Sabit, Quandt, Lorna, Alikhani, Malihe
End-to-end sign language generation models do not accurately represent the prosody in sign language. A lack of temporal and spatial variations leads to poor-quality generated presentations that confuse human interpreters. In this paper, we aim to improve the prosody in generated sign languages by modeling intensification in a data-driven manner. We present different strategies grounded in linguistics of sign language that inform how intensity modifiers can be represented in gloss annotations. To employ our strategies, we first annotate a subset of the benchmark PHOENIX-14T, a German Sign Language dataset, with different levels of intensification. We then use a supervised intensity tagger to extend the annotated dataset and obtain labels for the remaining portion of it. This enhanced dataset is then used to train state-of-the-art transformer models for sign language generation. We find that our efforts in intensification modeling yield better results when evaluated with automatic metrics. Human evaluation also indicates a higher preference for the videos generated using our model.
Sample, Translate, Recombine: Leveraging Audio Alignments for Data Augmentation in End-to-end Speech Translation
Lam, Tsz Kin, Schamoni, Shigehiko, Riezler, Stefan
End-to-end speech translation relies on data that pair source-language speech inputs with corresponding translations into a target language. Such data are notoriously scarce, making synthetic data augmentation by back-translation or knowledge distillation a necessary ingredient of end-to-end training. In this paper, we present a novel approach to data augmentation that leverages audio alignments, linguistic properties, and translation. First, we augment a transcription by sampling from a suffix memory that stores text and audio data. Second, we translate the augmented transcript. Finally, we recombine concatenated audio segments and the generated translation. Besides training an MT-system, we only use basic off-the-shelf components without fine-tuning. While having similar resource demands as knowledge distillation, adding our method delivers consistent improvements of up to 0.9 and 1.1 BLEU points on five language pairs on CoVoST 2 and on two language pairs on Europarl-ST, respectively.
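The three steps above (sample, translate, recombine) can be sketched roughly as follows. This is a toy illustration, not the authors' implementation: it assumes word-level audio alignments are already available as `(word, audio_chunk)` pairs, that pivot words are given as a plain set (the abstract only says linguistic properties are used), and that `translate` is any callable standing in for the MT system. All function and parameter names are hypothetical.

```python
import random
from collections import defaultdict


def build_suffix_memory(utterances, pivots):
    """Index every suffix that directly follows a pivot word.

    utterances: list of aligned utterances, each a list of
    (word, audio_chunk) pairs; audio_chunk is a list of samples.
    pivots: set of words after which a suffix may be swapped (assumption).
    """
    memory = defaultdict(list)
    for aligned in utterances:
        for i, (word, _) in enumerate(aligned):
            if word in pivots and i + 1 < len(aligned):
                memory[word].append(aligned[i + 1:])
    return memory


def augment(aligned, memory, translate, rng):
    """Return (audio, transcript, translation) for one new example, or None."""
    # Positions whose word has at least one stored suffix.
    positions = [i for i, (word, _) in enumerate(aligned) if word in memory]
    if not positions:
        return None
    i = rng.choice(positions)
    # Step 1 (sample): keep the prefix up to the pivot, swap in a stored suffix.
    suffix = rng.choice(memory[aligned[i][0]])
    combined = aligned[: i + 1] + suffix
    transcript = " ".join(word for word, _ in combined)
    # Step 2 (translate): translate the augmented transcript.
    translation = translate(transcript)
    # Step 3 (recombine): concatenate the matching audio segments.
    audio = [sample for _, chunk in combined for sample in chunk]
    return audio, transcript, translation
```

Because the suffix is drawn at random, the sampled suffix may come from the same utterance, in which case the example is simply reproduced; a real pipeline would presumably filter such cases.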
Divide and Rule: Effective Pre-Training for Context-Aware Multi-Encoder Translation Models
Lupo, Lorenzo, Dinarelli, Marco, Besacier, Laurent
Multi-encoder models are a broad family of context-aware neural machine translation systems that aim to improve translation quality by encoding document-level contextual information alongside the current sentence. The context encoding is undertaken by contextual parameters, trained on document-level data. In this work, we discuss the difficulty of training these parameters effectively, due to the sparsity of the words in need of context (i.e., the training signal), and their relevant context. We propose to pre-train the contextual parameters over split sentence pairs, which makes efficient use of the available data for two reasons. Firstly, it increases the contextual training signal by breaking intra-sentential syntactic relations, thus pushing the model to search the context for disambiguating clues more frequently. Secondly, it eases the retrieval of relevant context, since context segments become shorter. We propose four different splitting methods, and evaluate our approach with BLEU and contrastive test sets. Results show that it consistently improves learning of contextual parameters, in both low- and high-resource settings.
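To make the idea of split sentence pairs concrete, here is a minimal sketch that cuts a source/target pair at the token midpoint, so the first halves serve as context for the second halves. The midpoint rule is an assumption chosen for simplicity; the abstract does not describe the four splitting methods, and `split_pair` and its dictionary keys are hypothetical names.

```python
def split_pair(source, target):
    """Split a sentence pair into a (context, current) pair at the token midpoint.

    Returns None when either side is too short to be split.
    """
    src, tgt = source.split(), target.split()
    if len(src) < 2 or len(tgt) < 2:
        return None
    s_mid, t_mid = len(src) // 2, len(tgt) // 2
    return {
        "context_src": " ".join(src[:s_mid]),
        "context_tgt": " ".join(tgt[:t_mid]),
        "current_src": " ".join(src[s_mid:]),
        "current_tgt": " ".join(tgt[t_mid:]),
    }
```

A split of this kind severs syntactic relations that were inside one sentence, so a model translating the second half has to consult the first half as context.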
Know More About Natural Language Processing (NLP) & AI
Natural language processing (NLP) is an area of artificial intelligence (AI) that focuses on assisting computers in understanding how humans write and communicate. This is a difficult task because of the large amount of unstructured data. Individuals' speaking and writing styles are unique, and they are continually changing to suit widespread usage. Understanding context is another issue that requires semantic analysis to be solved by machine learning. Natural language understanding (NLU) is a sub-branch of natural language processing (NLP) that deals with these complexities through machine reading comprehension rather than merely comprehending literal meanings. These functions improve as we write, speak, and converse with computers more: they are constantly learning.
Meta's machine translation journey
There are around 7,000 languages spoken globally, but most translation models focus on English and other popular languages. This excludes a major part of the world from the benefit of having access to content, technologies and other advantages of being online. Tech giants are trying to bridge this gap. Just days ago, Meta announced that it plans to bring out a Universal Speech Translator to translate speech from one language to another in real time. This announcement is not surprising to anyone who follows the company closely. Meta has been devoted to bringing innovations in machine translation for quite some time now.
Digital Babel Fish: The holy grail of Conversational AI
Yesterday's science fiction is today's invention. Babel Fish, the "oddest thing in the universe", is a species of fish featured in Douglas Adams's magnum opus, The Hitchhiker's Guide to the Galaxy. The fish, worn as an earpiece, instantly translates all the languages that ever existed. Babel Fish is no longer the stuff of dreams: thanks to advances in AI, especially in the NLP domain, many tech giants are in the process of building a universal translator. To that end, the Universal Speech Translator was a dominant theme at Meta's Inside the Lab event on February 23.
Semantic-Preserving Linguistic Steganography by Pivot Translation and Semantic-Aware Bins Coding
Yang, Tianyu, Wu, Hanzhou, Yi, Biao, Feng, Guorui, Zhang, Xinpeng
Linguistic steganography (LS) aims to embed secret information into a highly encoded text for covert communication. It can be roughly divided into two main categories, i.e., modification-based LS (MLS) and generation-based LS (GLS). Unlike MLS, which hides secret data by slightly modifying a given text without impairing the meaning of the text, GLS uses a trained language model to directly generate a text carrying secret data. A common disadvantage of MLS methods is that the embedding payload is very low, although in return the semantic quality of the text is well preserved. In contrast, GLS allows the data hider to embed a high payload, but at the high price of uncontrollable semantics. In this paper, we propose a novel LS method that modifies a given text by pivoting it between two different languages and embeds secret data by applying a GLS-like information encoding strategy. Our purpose is to alter the expression of the given text, enabling a high payload to be embedded while keeping the semantic information unchanged. Experimental results have shown that the proposed work not only achieves a high embedding payload, but also shows superior performance in maintaining semantic consistency and resisting linguistic steganalysis.
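For readers unfamiliar with the "bins coding" named in the title, the sketch below shows plain bins coding as commonly used in generation-based steganography: the vocabulary is partitioned into 2^bits bins, and each generated token is chosen from the bin indexed by the next chunk of secret bits. It is not the semantic-aware variant proposed in the paper, whose details the abstract does not give. The modulo partition, the `ranked_candidates` input (standing in for a language model's ranked next-token list at each step), and all names are assumptions for illustration.

```python
def bin_of(token, vocab, bits):
    """Bin index of a token; here simply its vocabulary index modulo 2**bits."""
    return vocab.index(token) % (1 << bits)


def embed(secret, vocab, ranked_candidates, bits):
    """Encode a bit string whose length is a multiple of `bits`.

    ranked_candidates[step] lists tokens from most to least preferred
    at that generation step (a stand-in for language model output).
    """
    tokens = []
    for step, start in enumerate(range(0, len(secret), bits)):
        chunk = int(secret[start:start + bits], 2)
        # Take the best-ranked token that falls in the required bin.
        tokens.append(
            next(t for t in ranked_candidates[step] if bin_of(t, vocab, bits) == chunk)
        )
    return tokens


def extract(tokens, vocab, bits):
    """Recover the bit string by reading off each token's bin index."""
    return "".join(format(bin_of(t, vocab, bits), f"0{bits}b") for t in tokens)
```

The receiver needs only the shared vocabulary partition to decode, which is why the payload grows with `bits` while the choice of tokens, and hence control over semantics, shrinks.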
Baidu Launches Digital Platform for AI Sign Language
Baidu AI Cloud launched a sign language platform on Thursday, able to generate digital avatars for sign language translation and live interpretation within minutes. Released as a new offering of Baidu AI Cloud's digital avatar platform XiLing, this new product aims to help break down communication barriers for the deaf and hard-of-hearing (DHH) community by boosting the accessibility of automated sign language translation. An AI sign language interpreter developed using the platform will perform its duties during the upcoming 2022 Beijing Winter Paralympic Games. Also released with the platform on Thursday were two all-in-one AI sign language translators, providing one-stop solutions with a streamlined set-up process and plug-and-use features. With the technological changes brought by AI, production and operational costs of digital avatars have been reduced to a significant degree, making it possible for AI sign language to scale up and serve more DHH individuals, said Tian Wu, Baidu Corporate Vice President.
What to Expect from the Language Industry in 2022
The language industry is having a moment. The ongoing global health crisis has forced organizations to break down borders and support a global remote workforce, requiring more cross-language interactions and coordination than ever before. At the same time, technological innovation in the language translation industry is at an all-time high. We've never before had access to such sophisticated technology tools to manage translation processes. I predict it's going to be an exciting year in the industry, with an unprecedented level of innovation.
Paper Review: Meta-Learning for Low-Resource Neural Machine Translation
So, without further ado, let's jump into this awesome paper. This paper talks about low-resource neural machine translation, which means translating less common languages into English or other widely spoken languages. The task is framed as a meta-learning problem because little translation data is available for languages like Romanian or other regional languages. The proposed methodology should learn from commonly available language translations and use that knowledge to translate Romanian or Finnish into English. Let's define the problem in a technical manner.