Machine Translation
Google Translate adds support for 24 new languages
Google is adding support for 24 new languages to its Translate tool, the company announced today during its I/O 2022 developer conference. Among the newly available languages are Sanskrit, Tsonga and Sorani Kurdish. One of the new additions, Assamese, is used by approximately 25 million people in Northeast India. Another, Dhivehi, is spoken by about 300,000 people in the Maldives. According to Google CEO Sundar Pichai, the expansion allows the company to cover languages spoken by more than 300 million people and brings the total number of languages supported by Translate to 133.
Efficient yet Competitive Speech Translation: FBK@IWSLT2022
Gaido, Marco, Papi, Sara, Fucci, Dennis, Fiameni, Giuseppe, Negri, Matteo, Turchi, Marco
The primary goal of FBK's systems submission to the IWSLT 2022 offline and simultaneous speech translation tasks is to reduce model training costs without sacrificing translation quality. As such, we first question the need for ASR pre-training, showing that it is not essential to achieve competitive results. Second, we focus on data filtering, showing that a simple method that looks at the ratio between source and target characters yields a quality improvement of 1 BLEU. Third, we compare different methods to reduce the detrimental effect of the audio segmentation mismatch between training data manually segmented at sentence level and inference data that is automatically segmented. Towards the same goal of training cost reduction, we participate in the simultaneous task with the same model trained for offline ST. The effectiveness of our lightweight training strategy is shown by the high score obtained on the MuST-C en-de corpus (26.7 BLEU) and is confirmed in high-resource data conditions by a 1.6 BLEU improvement on the IWSLT2020 test set over last year's winning system.
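The character-ratio filter mentioned in the abstract can be sketched as follows. This is a minimal illustration, not FBK's implementation: the function names `char_ratio` and `filter_by_char_ratio` and the bounds `min_ratio` and `max_ratio` are assumptions made for the example, since the abstract does not state the thresholds used.

```python
def char_ratio(source: str, target: str) -> float:
    """Ratio of source length to target length, measured in characters."""
    return len(source) / max(len(target), 1)


def filter_by_char_ratio(pairs, min_ratio=0.5, max_ratio=2.0):
    """Keep (source, target) pairs whose character ratio lies within the bounds.

    The default bounds are hypothetical, not values taken from the paper.
    Pairs with an empty side are always dropped.
    """
    return [
        (src, tgt)
        for src, tgt in pairs
        if src and tgt and min_ratio <= char_ratio(src, tgt) <= max_ratio
    ]


pairs = [
    ("Good morning.", "Guten Morgen."),
    ("Yes.", "Ja, das ist eine sehr lange und vermutlich falsch ausgerichtete Zeile."),
    ("", "Leer"),
]
kept = filter_by_char_ratio(pairs)
```

In this toy input only the first pair survives: the second has a far longer target than source, which is the kind of misalignment such a filter is meant to catch, and the third has an empty source.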
Learn To Remember: Transformer with Recurrent Memory for Document-Level Machine Translation
Feng, Yukun, Li, Feng, Song, Ziang, Zheng, Boyuan, Koehn, Philipp
The Transformer architecture has led to significant gains in machine translation. However, most studies focus only on sentence-level translation without considering context dependencies within documents, which leads to inadequate document-level coherence. Some recent research has tried to mitigate this issue by introducing an additional context encoder or by translating multiple sentences or even the entire document at once. Such methods may lose information on the target side or incur increasing computational complexity as documents get longer. To address these problems, we introduce a recurrent memory unit to the vanilla Transformer, which supports the exchange of information between the sentence and the previous context. The memory unit is recurrently updated by acquiring information from sentences and passing the aggregated knowledge back to subsequent sentence states. We follow a two-stage training strategy, in which the model is first trained at the sentence level and then finetuned for document-level translation. We conduct experiments on three popular datasets for document-level machine translation, and our model has an average improvement of 0.91 s-BLEU over the sentence-level baseline. We also achieve state-of-the-art results on TED and News, outperforming the previous work by 0.36 s-BLEU and 1.49 d-BLEU on average.
Council Post: Automation Is Here: Ways AI And ML Are Transforming Digital Publishing
According to Statista, digital publishing generates worldwide revenue of $22.05 billion. Globally, countries that have access to digital media have witnessed a sharp rise in its popularity. However, with global accessibility comes the challenge of producing high-quality content consistently in large volumes. Additionally, with the rise in voice-based and image searches, content discoverability is the need of the hour. Artificial intelligence (AI) can help in this endeavor.
FFCI: A Framework for Interpretable Automatic Evaluation of Summarization
Koto, Fajri (University of Melbourne) | Baldwin, Timothy (University of Melbourne) | Lau, Jey Han (University of Melbourne)
In this paper, we propose FFCI, a framework for fine-grained summarization evaluation that comprises four elements: faithfulness (degree of factual consistency with the source), focus (precision of summary content relative to the reference), coverage (recall of summary content relative to the reference), and inter-sentential coherence (document fluency between adjacent sentences). We construct a novel dataset for focus, coverage, and inter-sentential coherence, and develop automatic methods for evaluating each of the four dimensions of FFCI based on cross-comparison of evaluation metrics and model-based evaluation methods, including question answering (QA) approaches, semantic textual similarity (STS), next-sentence prediction (NSP), and scores derived from 19 pre-trained language models. We then apply the developed metrics in evaluating a broad range of summarization models across two datasets, with some surprising findings.
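The abstract defines focus as precision and coverage as recall of summary content relative to the reference. A minimal sketch of that relationship follows, using plain token overlap as a stand-in scorer. This is an assumption for illustration only: the paper evaluates these dimensions with QA, STS and language-model-based metrics, not with the whitespace tokenization and function names used here.

```python
from collections import Counter


def overlap(summary_tokens, reference_tokens):
    """Number of tokens shared by both sides (clipped multiset overlap)."""
    common = Counter(summary_tokens) & Counter(reference_tokens)
    return sum(common.values())


def focus(summary: str, reference: str) -> float:
    """Precision: share of summary tokens that also appear in the reference."""
    s, r = summary.lower().split(), reference.lower().split()
    return overlap(s, r) / len(s) if s else 0.0


def coverage(summary: str, reference: str) -> float:
    """Recall: share of reference tokens that also appear in the summary."""
    s, r = summary.lower().split(), reference.lower().split()
    return overlap(s, r) / len(r) if r else 0.0


summary = "the cat sat"
reference = "the cat sat on the mat"
focus_score = focus(summary, reference)        # 3 of 3 summary tokens match
coverage_score = coverage(summary, reference)  # 3 of 6 reference tokens match
```

A short summary that copies part of the reference scores high on focus but low on coverage, which is why the two dimensions are reported separately.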
Neighbors Are Not Strangers: Improving Non-Autoregressive Translation under Low-Frequency Lexical Constraints
Zeng, Chun, Chen, Jiangjie, Zhuang, Tianyi, Xu, Rui, Yang, Hao, Qin, Ying, Tao, Shimin, Xiao, Yanghua
Current autoregressive approaches to lexically constrained translation suffer from high latency. In this paper, we focus on non-autoregressive translation (NAT) for this problem because of its efficiency advantage. We identify that current constrained NAT models, which are based on iterative editing, do not handle low-frequency constraints well. To this end, we propose a plug-in algorithm for this line of work, i.e., Aligned Constrained Training (ACT), which alleviates this problem by familiarizing the model with the source-side context of the constraints. Experiments on the general and domain datasets show that our model improves over the backbone constrained NAT model in constraint preservation and translation quality, especially for rare constraints.
Attention Mechanism with Energy-Friendly Operations
Wan, Yu, Yang, Baosong, Liu, Dayiheng, Xiao, Rong, Wong, Derek F., Zhang, Haibo, Chen, Boxing, Chao, Lidia S.
The attention mechanism has become the dominant module in natural language processing models. It is computationally intensive and depends on massive numbers of power-hungry multiplications. In this paper, we rethink variants of the attention mechanism from the perspective of energy consumption. After reaching the conclusion that the energy costs of several energy-friendly operations are far lower than those of their multiplication counterparts, we build a novel attention model by replacing multiplications with either selective operations or additions. Empirical results on three machine translation tasks demonstrate that the proposed model, compared with the vanilla one, achieves comparable accuracy while saving 99% and 66% of the energy during alignment calculation and the whole attention procedure, respectively. Code is available at: https://github.com/NLP2CT/E-Att.
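To make the idea of replacing multiplications in the alignment calculation concrete, the sketch below contrasts the usual dot-product score with a negative L1 distance, which needs only subtractions, absolute values and additions. The L1 score is a swapped-in illustration, an assumption rather than the formulation confirmed by the abstract; the softmax step is left in its standard form, and all names are hypothetical.

```python
import math


def dot_score(q, k):
    """Standard alignment score: one multiplication per dimension."""
    return sum(qi * ki for qi, ki in zip(q, k))


def l1_score(q, k):
    """Addition-only alignment score: negative L1 distance (assumed stand-in)."""
    return -sum(abs(qi - ki) for qi, ki in zip(q, k))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]

# Attention weights computed from the addition-only scores.
weights = softmax([l1_score(query, k) for k in keys])
```

With these toy vectors the key identical to the query receives the largest weight under both scoring functions, so the ranking of the best match is preserved even though no multiplications are used to compute the scores.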
UniTE: Unified Translation Evaluation
Wan, Yu, Liu, Dayiheng, Yang, Baosong, Zhang, Haibo, Chen, Boxing, Wong, Derek F., Chao, Lidia S.
Translation quality evaluation plays a crucial role in machine translation. According to the input format, it is mainly separated into three tasks, i.e., reference-only, source-only and source-reference-combined. Recent methods, despite their promising results, are specifically designed and optimized for one of them. This limits the convenience of these methods and overlooks the commonalities among tasks. In this paper, we propose UniTE, the first unified framework able to handle all three evaluation tasks. Concretely, we propose monotonic regional attention to control the interaction among input segments, and unified pretraining to better adapt to multi-task learning. We test our framework on the WMT 2019 Metrics and WMT 2020 Quality Estimation benchmarks. Extensive analyses show that our single model can universally surpass various state-of-the-art or winning methods across tasks. Both source code and associated models are available at https://github.com/NLP2CT/UniTE.
Raising Robovoices
In a critical episode of The Mandalorian, a TV series set in the Star Wars universe, a mysterious Jedi fights his way through a horde of evil robots. As the heroes of the show wait anxiously to learn the identity of their cloaked savior, he lowers his hood, and--spoiler alert-- they meet a young Luke Skywalker. Actually, what we see is an animated, de-aged version of the Jedi. Then Luke speaks, in a voice that sounds very much like the 1980s-era rendition of the character, thanks to the use of an advanced machine learning model developed by the voice technology startup Respeecher. "No one noticed that it was generated by a machine," says Dmytro Bielievtsov, chief technology officer at Respeecher.
Multilingual Machine Translation: Deep Analysis of Language-Specific Encoder-Decoders
Escolano, Carlos (Universitat Politècnica de Catalunya) | Costa-jussà, Marta R. | Fonollosa, José A. R. (Universitat Politècnica de Catalunya)
State-of-the-art multilingual machine translation relies on a shared encoder-decoder. In this paper, we propose an alternative approach based on language-specific encoder-decoders, which can be easily extended to new languages by learning their corresponding modules. To establish a common interlingua representation, we simultaneously train N initial languages. Our experiments show that the proposed approach improves over the shared encoder-decoder for the initial languages and when adding new languages, without the need to retrain the remaining modules. All in all, our work closes the gap between shared and language-specific encoder-decoders, advancing toward modular multilingual machine translation systems that can be flexibly extended in lifelong learning settings.