Machine Translation
Machine Learning Behind Google Translate Services - AI Summary
Google Translate initially launched with Phrase-Based Machine Translation as its key algorithm. The main improvement in translation quality came with the introduction of Google Neural Machine Translation (GNMT). With Translatotron, Google demonstrated that a single sequence-to-sequence model can translate speech in one language directly into speech in another, without the intermediate text representation that cascaded systems rely on. Translatotron is claimed to be the first end-to-end model to do so, and it was also able to retain the source speaker's voice in the translated speech.
WordBias: An Interactive Visual Tool for Discovering Intersectional Biases Encoded in Word Embeddings
Ghai, Bhavya, Hoque, Md Naimul, Mueller, Klaus
Intersectional bias is a bias caused by an overlap of multiple social factors like gender, sexuality, race, disability, religion, etc. A recent study has shown that word embedding models can be laden with biases against intersectional groups such as African American females. The first step towards tackling such intersectional biases is to identify them. However, discovering biases against different intersectional groups remains a challenging task. In this work, we present WordBias, an interactive visual tool designed to explore biases against intersectional groups encoded in static word embeddings. Given a pretrained static word embedding, WordBias computes the association of each word along different groups based on race, age, etc. and then visualizes them using a novel interactive interface. Using a case study, we demonstrate how WordBias can help uncover biases against intersectional groups like Black Muslim Males, Poor Females, etc. encoded in word embeddings. In addition, we also evaluate our tool using qualitative feedback from expert interviews. The source code for this tool can be publicly accessed for reproducibility at github.com/bhavyaghai/WordBias.
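As a rough illustration of what "the association of each word along different groups" could look like, the sketch below scores one word against two group axes using a difference of cosine similarities. This is only an assumption made for illustration: the abstract does not say which association metric WordBias uses, and the toy vectors, group word lists, and function names here are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def centroid(vectors):
    """Component-wise mean of a list of vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def association(word_vec, group_a_vecs, group_b_vecs):
    """Positive: the word sits closer to group A; negative: closer to group B."""
    return (cosine(word_vec, centroid(group_a_vecs))
            - cosine(word_vec, centroid(group_b_vecs)))


# Made-up 3-D "embeddings"; a real run would load pretrained static vectors.
embeddings = {
    "he": [1.0, 0.0, 0.0],
    "she": [-1.0, 0.0, 0.0],
    "young": [0.0, 1.0, 0.0],
    "old": [0.0, -1.0, 0.0],
    "grandmother": [-0.6, -0.7, 0.2],
}

# Each axis maps to (group A word vectors, group B word vectors).
axes = {
    "gender": ([embeddings["he"]], [embeddings["she"]]),
    "age": ([embeddings["young"]], [embeddings["old"]]),
}

# One score per axis; the tuple of scores is the word's intersectional profile.
profile = {
    axis: association(embeddings["grandmother"], group_a, group_b)
    for axis, (group_a, group_b) in axes.items()
}
print(profile)
```

In this toy setup, a word with a strongly signed score on several axes at once is a candidate for an intersectional association, which is the kind of pattern an interactive view would let a user filter for.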
IOT: Instance-wise Layer Reordering for Transformer Structures
Zhu, Jinhua, Wu, Lijun, Xia, Yingce, Xie, Shufang, Qin, Tao, Zhou, Wengang, Li, Houqiang, Liu, Tie-Yan
With sequentially stacked self-attention, (optional) encoder-decoder attention, and feed-forward layers, the Transformer has achieved great success in natural language processing (NLP), and many variants have been proposed. Currently, almost all these models assume that the layer order is fixed and kept the same across data samples. We observe that different data samples actually favor different orders of the layers. Based on this observation, in this work, we break the assumption of the fixed layer order in the Transformer and introduce instance-wise layer reordering into the model structure. Our Instance-wise Ordered Transformer (IOT) can model variant functions by reordered layers, which enables each sample to select the better one to improve the model performance under the constraint of almost the same number of parameters. To achieve this, we introduce a light predictor with negligible parameter and inference cost to decide the most capable and favorable layer order for any input sequence. Experiments on 3 tasks (neural machine translation, abstractive summarization, and code generation) and 9 datasets demonstrate consistent improvements of our method. We further show that our method can also be applied to other architectures beyond Transformer. Our code is released on GitHub.
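The idea of a light predictor choosing a layer order per input can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' implementation: the two "layers" are stand-in functions rather than attention or feed-forward blocks, the predictor is a hand-set linear scorer over the mean input value, and selection is a plain argmax. All names and weights are hypothetical.

```python
from itertools import permutations


def scale(x):
    """Stand-in for one sublayer: doubles every value."""
    return [2.0 * v for v in x]


def shift(x):
    """Stand-in for another sublayer: adds one to every value."""
    return [v + 1.0 for v in x]


LAYERS = {"scale": scale, "shift": shift}

# Every candidate ordering of the same layers; parameters are shared across orders.
ORDERS = list(permutations(LAYERS))

# One (weight, bias) pair per candidate order; hand-set here, learned in practice.
PREDICTOR = [(1.0, 0.0), (-1.0, 0.0)]


def predict_order(x):
    """Score each candidate order from a pooled input feature and pick the best."""
    feature = sum(x) / len(x)
    logits = [w * feature + b for w, b in PREDICTOR]
    best = max(range(len(ORDERS)), key=lambda i: logits[i])
    return ORDERS[best]


def forward(x):
    """Apply the shared layers in the order chosen for this particular input."""
    order = predict_order(x)
    for name in order:
        x = LAYERS[name](x)
    return order, x


print(forward([1.0, 3.0]))
print(forward([-1.0, -3.0]))
```

The two inputs are routed through the same two functions in opposite orders, which is the point of the sketch: the parameter count is essentially unchanged, and only the small predictor is added.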
An empirical analysis of phrase-based and neural machine translation
Two popular types of machine translation (MT) are phrase-based and neural machine translation systems. Both of these types of systems are composed of multiple complex models or layers. Each of these models and layers learns different linguistic aspects of the source language. However, for some of these models and layers, it is not clear which linguistic phenomena are learned or how this information is learned. For phrase-based MT systems, it is often clear what information is learned by each model, and the question is rather how this information is learned, especially for its phrase reordering model. For neural machine translation systems, the situation is even more complex, since for many cases it is not exactly clear what information is learned and how it is learned. To shed light on what linguistic phenomena are captured by MT systems, we analyze the behavior of important models in both phrase-based and neural MT systems. We consider phrase reordering models from phrase-based MT systems to investigate which words from inside of a phrase have the biggest impact on defining the phrase reordering behavior. Additionally, to contribute to the interpretability of neural MT systems we study the behavior of the attention model, which is a key component in neural MT systems and the closest model in functionality to phrase reordering models in phrase-based systems. The attention model together with the encoder hidden state representations form the main components to encode source side linguistic information in neural MT. To this end, we also analyze the information captured in the encoder hidden state representations of a neural MT system. We investigate the extent to which syntactic and lexical-semantic information from the source side is captured by hidden state representations of different neural MT architectures.
AbbVie Accelerates Natural Language Processing
AbbVie is a research-based biopharmaceutical company that serves more than 30 million patients in 175 countries. With its global scale, AbbVie partnered with Intel to optimize processes for its more than 47,000 employees. This whitepaper highlights two use cases that are important to AbbVie's research. The first is Abbelfish Machine Translation, AbbVie's language translation service based on the Transformer NLP model, which leverages second-generation Intel Xeon Scalable processors and the Intel Optimization for TensorFlow with Intel oneAPI Deep Neural Network Library (oneDNN). AbbVie was able to achieve a 1.9x improvement in throughput for Abbelfish language translation using Intel Optimization for TensorFlow 1.15 with oneAPI Deep Neural Network Library when compared to TensorFlow 1.15 without oneDNN.
LTI Value Cast: Reshaping Remote Work & Meetings with Linguistic AI
The abrupt move to an almost exclusively home-based working environment at the start of the COVID-19 crisis resulted in a new work ethic: back-to-back web calls from your living room or kitchen, across various web conferencing systems, often requiring participants to handle multilingual interactions. On-site, in-person meetings were facilitated by interpreters, traditional meeting-note writing, and lengthy post-meeting analysis and review. In the new virtual environment, it is up to advanced language technologies powered by Artificial Intelligence to solve these issues. Speech to text, neural machine translation and hybrid natural language understanding will automate complex human tasks and replace the more repetitive processes, creating a "digital work companion" that can assist with the challenges of the next fast-paced remote working environment.
OmniNet: Omnidirectional Representations from Transformers
Tay, Yi, Dehghani, Mostafa, Aribandi, Vamsi, Gupta, Jai, Pham, Philip, Qin, Zhen, Bahri, Dara, Juan, Da-Cheng, Metzler, Donald
This paper proposes Omnidirectional Representations from Transformers (OmniNet). In OmniNet, instead of maintaining a strictly horizontal receptive field, each token is allowed to attend to all tokens in the entire network. This process can also be interpreted as a form of extreme or intensive attention mechanism that has the receptive field of the entire width and depth of the network. To this end, the omnidirectional attention is learned via a meta-learner, which is essentially another self-attention based model. In order to mitigate the computationally expensive costs of full receptive field attention, we leverage efficient self-attention models such as kernel-based (Choromanski et al.), low-rank attention (Wang et al.) and/or Big Bird (Zaheer et al.) as the meta-learner. Extensive experiments are conducted on autoregressive language modeling (LM1B, C4), Machine Translation, Long Range Arena (LRA), and Image Recognition. The experiments show that OmniNet achieves considerable improvements across these tasks, including achieving state-of-the-art performance on LM1B, WMT'14 En-De/En-Fr, and Long Range Arena. Moreover, using omnidirectional representation in Vision Transformers leads to significant improvements on image recognition tasks on both few-shot learning and fine-tuning setups.
AI Incident Database Spotlights Worst Machine Translation Fails
In the ongoing popular (albeit shallow) debate pitting human translators against machine translation (MT), one constant is the question of quality -- how to define it, how to measure it, and how to improve it. Now, a new website, the AI Incident Database (AIID), aims to quantify the risks presented, and actual harm caused, by AI. Sean McGregor, ML architect at Syntiant and developer of the AIID, described the "collective memory of [AI systems'] failings" in a November 2020 paper. As McGregor explained, the AIID is a project of the Partnership on AI (PAI), an organization funded by tech companies and governed by a board comprising corporate partners and non-profits. The AIID is modeled on incident databases in other industries, namely aviation and cybersecurity, which promote transparency.
CURE: Code-Aware Neural Machine Translation for Automatic Program Repair
Jiang, Nan, Lutellier, Thibaud, Tan, Lin
Automatic program repair (APR) is crucial to improve software reliability. Recently, neural machine translation (NMT) techniques have been used to fix software bugs automatically. While promising, these approaches have two major limitations. Their search space often does not contain the correct fix, and their search strategy ignores software knowledge such as strict code syntax. Due to these limitations, existing NMT-based techniques underperform the best template-based approaches. We propose CURE, a new NMT-based APR technique with three major novelties. First, CURE pre-trains a programming language (PL) model on a large software codebase to learn developer-like source code before the APR task. Second, CURE designs a new code-aware search strategy that finds more correct fixes by focusing on compilable patches and patches that are close in length to the buggy code. Finally, CURE uses a subword tokenization technique to generate a smaller search space that contains more correct fixes. Our evaluation on two widely-used benchmarks shows that CURE correctly fixes 57 Defects4J bugs and 26 QuixBugs bugs, outperforming all existing APR techniques on both benchmarks.
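The two signals named for the code-aware search strategy, compilability and closeness in length to the buggy code, can be illustrated with a small candidate filter. This is a simplified sketch under loud assumptions: it re-ranks finished candidates after generation rather than steering a search, it uses Python's `ast.parse` as a stand-in for a compile check, and it measures length in whitespace-separated tokens. None of the names below come from CURE itself.

```python
import ast


def compiles(snippet):
    """Stand-in compile check: does the snippet parse as Python source?"""
    try:
        ast.parse(snippet)
        return True
    except SyntaxError:
        return False


def rank_patches(buggy_line, candidates):
    """Drop candidates that do not parse, then prefer lengths close to the bug."""
    target_len = len(buggy_line.split())
    valid = [c for c in candidates if compiles(c)]
    return sorted(valid, key=lambda c: abs(len(c.split()) - target_len))


# Hypothetical buggy line and model-generated candidate patches.
buggy = "total = total - price"
candidates = [
    "total = sum(price for price in prices if price > 0) + total",
    "total = (total + price",   # unbalanced parenthesis, does not parse
    "total = total + price",
]

ranked = rank_patches(buggy, candidates)
for patch in ranked:
    print(patch)
```

Filtering before ranking keeps unparseable patches from ever reaching the test-execution stage, which is usually the expensive part of validating a candidate fix.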
Even Small Companies Use AI, Machine Learning
Data, technology, and people are at hand to make artificial intelligence and machine learning available to all commerce companies. To be certain, artificial intelligence and its sub-field, machine learning, have gone through cycles of inflated expectations followed by disappointments. For example, in the 1950s and 1960s, the United States government funded research for the machine translation of languages. The hope was that Russian-language documents could be instantly translated to English. But by 1966, a report from the Automatic Language Processing Advisory Committee, a government team of seven scientists, essentially killed machine translation research in the U.S. for about a decade.