AITopics | Machine Translation

Collaborating Authors

Machine Translation

"Machine translation (MT) is the application of computers to the task of translating texts from one natural language to another. One of the very earliest pursuits in computer science, MT has proved to be an elusive goal, but today a number of systems are available which produce output which, if not perfect, is of sufficient quality to be useful in a number of specific domains."
– Definition from the European Association for Machine Translation (EAMT).

You can translate text of your choice by using free translators such as: CAPITA, Google Translate, SDL International, SYSTRAN.

News Overviews Instructional Materials AI-Alerts Classics

Efficient Speech Translation with Dynamic Latent Perceivers

Tsiamas, Ioannis, Gállego, Gerard I., Fonollosa, José A. R., Costa-jussà, Marta R.

arXiv.org Artificial IntelligenceMar-14-2023

Transformers have been the dominant architecture for Speech Translation in recent years, achieving significant improvements in translation quality. Since speech signals are longer than their textual counterparts, and due to the quadratic complexity of the Transformer, a down-sampling step is essential for its adoption in Speech Translation. Instead, in this research, we propose to ease the complexity by using a Perceiver encoder to map the speech inputs to a fixed-length latent representation. Furthermore, we introduce a novel way of training Perceivers, with Dynamic Latent Access (DLA), unlocking larger latent spaces without any additional computational overhead. Speech-to-Text Perceivers with DLA can match the performance of Transformer baselines across three language pairs in MuST-C. Finally, a DLA-trained model is easily adaptable to DLA at inference, and can be flexibly deployed with various computational budgets, without significant drops in translation quality.

artificial intelligence, machine translation, natural language, (18 more...)

arXiv.org Artificial Intelligence

2210.16264

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Add feedback

AMOM: Adaptive Masking over Masking for Conditional Masked Language Model

Xiao, Yisheng, Xu, Ruiyang, Wu, Lijun, Li, Juntao, Qin, Tao, Liu, Yan-Tie, Zhang, Min

arXiv.org Artificial IntelligenceMar-13-2023

Transformer-based autoregressive (AR) methods have achieved appealing performance for varied sequence-to-sequence generation tasks, e.g., neural machine translation, summarization, and code generation, but suffer from low inference efficiency. To speed up the inference stage, many non-autoregressive (NAR) strategies have been proposed in the past few years. Among them, the conditional masked language model (CMLM) is one of the most versatile frameworks, as it can support many different sequence generation scenarios and achieve very competitive performance on these tasks. In this paper, we further introduce a simple yet effective adaptive masking over masking strategy to enhance the refinement capability of the decoder and make the encoder optimization easier. Experiments on \textbf{3} different tasks (neural machine translation, summarization, and code generation) with \textbf{15} datasets in total confirm that our proposed simple method achieves significant performance improvement over the strong CMLM model. Surprisingly, our proposed model yields state-of-the-art performance on neural machine translation (\textbf{34.62} BLEU on WMT16 EN$\to$RO, \textbf{34.82} BLEU on WMT16 RO$\to$EN, and \textbf{34.84} BLEU on IWSLT De$\to$En) and even better performance than the \textbf{AR} Transformer on \textbf{7} benchmark datasets with at least \textbf{2.2$\times$} speedup. Our code is available at GitHub.

artificial intelligence, machine translation, natural language, (14 more...)

arXiv.org Artificial Intelligence

2303.07457

Country: Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Add feedback

Rethinking the Reasonability of the Test Set for Simultaneous Machine Translation

Liu, Mengge, Zhang, Wen, Li, Xiang, Luan, Jian, Wang, Bin, Guo, Yuhang, Chen, Shuoying

arXiv.org Artificial IntelligenceMar-13-2023

Simultaneous machine translation (SimulMT) models start translation before the end of the source sentence, making the translation monotonically aligned with the source sentence. However, the general full-sentence translation test set is acquired by offline translation of the entire source sentence, which is not designed for SimulMT evaluation, making us rethink whether this will underestimate the performance of SimulMT models. In this paper, we manually annotate a monotonic test set based on the MuST-C English-Chinese test set, denoted as SiMuST-C. Our human evaluation confirms the acceptability of our annotated test set. Evaluations on three different SimulMT models verify that the underestimation problem can be alleviated on our test set. Further experiments show that finetuning on an automatically extracted monotonic training set improves SimulMT models by up to 3 BLEU points.

artificial intelligence, natural language, translation, (17 more...)

arXiv.org Artificial Intelligence

2303.00969

Country:

Asia > China > Beijing > Beijing (0.05)
Asia > India > Karnataka > Bengaluru (0.04)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Add feedback

Named Entity Detection and Injection for Direct Speech Translation

Gaido, Marco, Tang, Yun, Kulikov, Ilia, Huang, Rongqing, Gong, Hongyu, Inaguma, Hirofumi

arXiv.org Artificial IntelligenceMar-11-2023

In a sentence, certain words are critical for its semantic. Among them, named entities (NEs) are notoriously challenging for neural models. Despite their importance, their accurate handling has been neglected in speech-to-text (S2T) translation research, and recent work has shown that S2T models perform poorly for locations and notably person names, whose spelling is challenging unless known in advance. In this work, we explore how to leverage dictionaries of NEs known to likely appear in a given context to improve S2T model outputs. Our experiments show that we can reliably detect NEs likely present in an utterance starting from S2T encoder outputs. Indeed, we demonstrate that the current detection quality is sufficient to improve NE accuracy in the translation with a 31% reduction in person name errors.

artificial intelligence, natural language, utterance, (18 more...)

arXiv.org Artificial Intelligence

2210.11981

Country:

Asia > Malaysia (0.28)
Europe > Ireland > Leinster > County Dublin > Dublin (0.05)
Europe > Portugal > Lisbon > Lisbon (0.04)
(11 more...)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Add feedback

A Kind Introduction to Lexical and Grammatical Aspect, with a Survey of Computational Approaches

Friedrich, Annemarie, Xue, Nianwen, Palmer, Alexis

arXiv.org Artificial IntelligenceMar-10-2023

Aspectual meaning refers to how the internal temporal structure of situations is presented. This includes whether a situation is described as a state or as an event, whether the situation is finished or ongoing, and whether it is viewed as a whole or with a focus on a particular phase. This survey gives an overview of computational approaches to modeling lexical and grammatical aspect along with intuitive explanations of the necessary linguistic concepts and terminology. In particular, we describe the concepts of stativity, telicity, habituality, perfective and imperfective, as well as influential inventories of eventuality and situation types. We argue that because aspect is a crucial component of semantics, especially when it comes to reporting the temporal structure of situations in a precise way, future NLP approaches need to be able to handle and evaluate it systematically in order to achieve human-level language understanding.

computational linguistic, large language model, machine learning, (23 more...)

arXiv.org Artificial Intelligence

2208.09012

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.14)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
(37 more...)

Genre: Overview (1.00)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
(7 more...)

Add feedback

Is In-hospital Meta-information Useful for Abstractive Discharge Summary Generation?

Ando, Kenichiro, Komachi, Mamoru, Okumura, Takashi, Horiguchi, Hiromasa, Matsumoto, Yuji

arXiv.org Artificial IntelligenceMar-10-2023

During the patient's hospitalization, the physician must record daily observations of the patient and summarize them into a brief document called "discharge summary" when the patient is discharged. Automated generation of discharge summary can greatly relieve the physicians' burden, and has been addressed recently in the research community. Most previous studies of discharge summary generation using the sequence-to-sequence architecture focus on only inpatient notes for input. However, electric health records (EHR) also have rich structured metadata (e.g., hospital, physician, disease, length of stay, etc.) that might be useful. This paper investigates the effectiveness of medical meta-information for summarization tasks. We obtain four types of meta-information from the EHR systems and encode each meta-information into a sequence-to-sequence model. Using Japanese EHRs, meta-information encoded models increased ROUGE-1 by up to 4.45 points and BERTScore by 3.77 points over the vanilla Longformer. Also, we found that the encoded meta-information improves the precisions of its related terms in the outputs. Our results showed the benefit of the use of medical meta-information.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/TAAI57707.2022.00034

2303.06002

Country: Asia > Japan (0.14)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Health Care Technology > Medical Record (0.90)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.68)

Add feedback

XLM -- Enhancing BERT for Cross-lingual Language Model

#artificialintelligenceMar-9-2023, 22:38:44 GMT

Attention models, and BERT in particular, have achieved promising results in Natural Language Processing, in both classification and translation tasks. A new paper by Facebook AI, named XLM, presents an improved version of BERT to achieve state-of-the-art results in both types of tasks. XLM uses a known pre-processing technique (BPE) and a dual-language training mechanism with BERT in order to learn relations between words in different languages. The model outperforms other models in a cross-lingual classification task (sentence entailment in 15 languages) and significantly improves machine translation when a pre-trained model is used for initialization of the translation model. XLM is based on several key concepts: Transformers, invented in 2017, introduced an attention mechanism that processes the entire text input simultaneously to learn contextual relations between words (or sub-words).

bert, transformer, translation model, (13 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.51)

Add feedback

MixSpeech: Cross-Modality Self-Learning with Audio-Visual Stream Mixup for Visual Speech Translation and Recognition

Cheng, Xize, Li, Linjun, Jin, Tao, Huang, Rongjie, Lin, Wang, Wang, Zehan, Liu, Huangdai, Wang, Ye, Yin, Aoxiong, Zhao, Zhou

arXiv.org Artificial IntelligenceMar-9-2023

Multi-media communications facilitate global interaction among people. However, despite researchers exploring cross-lingual translation techniques such as machine translation and audio speech translation to overcome language barriers, there is still a shortage of cross-lingual studies on visual speech. This lack of research is mainly due to the absence of datasets containing visual speech and translated text pairs. In this paper, we present \textbf{AVMuST-TED}, the first dataset for \textbf{A}udio-\textbf{V}isual \textbf{Mu}ltilingual \textbf{S}peech \textbf{T}ranslation, derived from \textbf{TED} talks. Nonetheless, visual speech is not as distinguishable as audio speech, making it difficult to develop a mapping from source speech phonemes to the target language text. To address this issue, we propose MixSpeech, a cross-modality self-learning framework that utilizes audio speech to regularize the training of visual speech tasks. To further minimize the cross-modality gap and its impact on knowledge transfer, we suggest adopting mixed speech, which is created by interpolating audio and visual streams, along with a curriculum learning strategy to adjust the mixing ratio as needed. MixSpeech enhances speech translation in noisy environments, improving BLEU scores for four languages on AVMuST-TED by +1.4 to +4.2. Moreover, it achieves state-of-the-art performance in lip reading on CMLR (11.1\%), LRS2 (25.5\%), and LRS3 (28.0\%).

artificial intelligence, machine translation, natural language, (17 more...)

arXiv.org Artificial Intelligence

2303.05309

Country:

Europe > Belgium (0.04)
Asia > Taiwan > Taiwan Province > Taipei (0.04)

Genre: Research Report (0.82)

Industry: Education > Educational Setting > Online (0.48)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Add feedback

Syntax and Domain Aware Model for Unsupervised Program Translation

Liu, Fang, Li, Jia, Zhang, Li

arXiv.org Artificial IntelligenceMar-9-2023

There is growing interest in software migration as the development of software and society. Manually migrating projects between languages is error-prone and expensive. In recent years, researchers have begun to explore automatic program translation using supervised deep learning techniques by learning from large-scale parallel code corpus. However, parallel resources are scarce in the programming language domain, and it is costly to collect bilingual data manually. To address this issue, several unsupervised programming translation systems are proposed. However, these systems still rely on huge monolingual source code to train, which is very expensive. Besides, these models cannot perform well for translating the languages that are not seen during the pre-training procedure. In this paper, we propose SDA-Trans, a syntax and domain-aware model for program translation, which leverages the syntax structure and domain knowledge to enhance the cross-lingual transfer ability. SDA-Trans adopts unsupervised training on a smaller-scale corpus, including Python and Java monolingual programs. The experimental results on function translation tasks between Python, Java, and C++ show that SDA-Trans outperforms many large-scale pre-trained models, especially for unseen language translation.

machine learning, natural language, translation, (20 more...)

arXiv.org Artificial Intelligence

2302.03908

Country:

North America > United States (0.14)
Asia > China > Beijing > Beijing (0.04)
Oceania > Australia (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Unsupervised Language agnostic WER Standardization

Guha, Satarupa, Ambavat, Rahul, Gupta, Ankur, Gupta, Manish, Mehta, Rupeshkumar

arXiv.org Artificial IntelligenceMar-9-2023

Word error rate (WER) is a standard metric for the evaluation of Automated Speech Recognition (ASR) systems. However, WER fails to provide a fair evaluation of human perceived quality in presence of spelling variations, abbreviations, or compound words arising out of agglutination. Multiple spelling variations might be acceptable based on locale/geography, alternative abbreviations, borrowed words, and transliteration of code-mixed words from a foreign language to the target language script. Similarly, in case of agglutination, often times the agglutinated, as well as the split forms, are acceptable. Previous work handled this problem by using manually identified normalization pairs and applying them to both the transcription and the hypothesis before computing WER. In this paper, we propose an automatic WER normalization system consisting of two modules: spelling normalization and segmentation normalization. The proposed system is unsupervised and language agnostic, and therefore scalable. Experiments with ASR on 35K utterances across four languages yielded an average WER reduction of 13.28%. Human judgements of these automatically identified normalization pairs show that our WER-normalized evaluation is highly consistent with the perceived quality of ASR output.

machine learning, natural language, variation, (20 more...)

arXiv.org Artificial Intelligence

2303.05046

Country:

North America > United States (0.04)
Europe > Sweden > Vaestra Goetaland > Gothenburg (0.04)
Europe > Portugal > Lisbon > Lisbon (0.04)
(2 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.70)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.69)

Add feedback