AITopics | Machine Translation

Machine Translation

"Machine translation (MT) is the application of computers to the task of translating texts from one natural language to another. One of the very earliest pursuits in computer science, MT has proved to be an elusive goal, but today a number of systems are available which produce output which, if not perfect, is of sufficient quality to be useful in a number of specific domains."
– Definition from the European Association for Machine Translation (EAMT).

You can translate text of your choice by using free translators such as: CAPITA, Google Translate, SDL International, SYSTRAN.

Direct Speech-to-speech Translation without Textual Annotation using Bottleneck Features

Zhang, Junhui, Pan, Junjie, Yin, Xiang, Ma, Zejun

arXiv.org Artificial Intelligence, Dec-12-2022

Speech-to-speech translation directly translates a speech utterance in one language into speech in another, and has great potential in tasks such as simultaneous interpretation. State-of-the-art models usually contain an auxiliary module for phoneme sequence prediction, which requires textual annotation of the training dataset. We propose a direct speech-to-speech translation model which can be trained without any textual annotation or content information. Instead of introducing an auxiliary phoneme prediction task in the model, we propose to use bottleneck features as intermediate training objectives for our model to ensure the translation performance of the system. Experiments on Mandarin-Cantonese speech translation demonstrate the feasibility of the proposed approach, and its performance can match a cascaded system with respect to translation and synthesis quality.

machine learning, natural language, translation, (19 more...)

arXiv.org Artificial Intelligence

2212.05805

Country: Asia > Japan > Honshū > Tōhoku > Iwate Prefecture > Morioka (0.04)

Genre: Research Report (0.83)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

A Novel Chinese Dialect TTS Frontend with Non-Autoregressive Neural Machine Translation

Zhang, Junhui, Bao, Wudi, Pan, Junjie, Yin, Xiang, Ma, Zejun

arXiv.org Artificial Intelligence, Dec-12-2022

Chinese dialects are different variations of Chinese and can be considered as different languages in the same language family as Mandarin. Though they all use Chinese characters, the pronunciations, grammar and idioms can vary significantly, and even local speakers may find it hard to input the correct written forms of a dialect. Besides, using Mandarin text as text-to-speech input would generate speech with poor naturalness. In this paper, we propose a novel Chinese dialect TTS frontend with a translation module, which converts Mandarin text into dialectal expressions to improve the intelligibility and naturalness of synthesized speech. A non-autoregressive neural machine translation model with various tricks is proposed for the translation task. It is the first known work to incorporate translation into a TTS frontend. Experiments on Cantonese show that the proposed model improves translation by 2.56 BLEU and TTS by 0.27 MOS with Mandarin inputs.

artificial intelligence, frontend, natural language, (13 more...)

arXiv.org Artificial Intelligence

2206.04922

Country:

Asia > China > Shanghai > Shanghai (0.05)
Africa > Middle East > Egypt > Giza Governorate > Giza (0.04)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

End-to-End Speech Translation of Arabic to English Broadcast News

Bougares, Fethi, Jouili, Salim

arXiv.org Artificial Intelligence, Dec-11-2022

Speech translation (ST) is the task of directly translating acoustic speech signals in a source language into text in a foreign language. The ST task has long been addressed using a pipeline approach with two modules: first an Automatic Speech Recognition (ASR) module in the source language, followed by a text-to-text Machine Translation (MT) module. In the past few years, we have seen a paradigm shift towards end-to-end approaches using sequence-to-sequence deep neural network models. This paper presents our efforts towards the development of the first Broadcast News end-to-end Arabic to English speech translation system. Starting from independent ASR and MT LDC releases, we were able to identify about 92 hours of Arabic audio recordings for which the manual transcription was also translated into English at the segment level. These data were used to train and compare pipeline and end-to-end speech translation systems under multiple scenarios, including transfer learning and data augmentation techniques.

machine learning, natural language, translation, (22 more...)

arXiv.org Artificial Intelligence

2212.05479

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Ireland > Leinster > County Dublin > Dublin (0.05)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
(8 more...)

Genre: Research Report (0.50)

Industry: Media > News (0.62)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)

SpeechLMScore: Evaluating speech generation using speech language model

Maiti, Soumi, Peng, Yifan, Saeki, Takaaki, Watanabe, Shinji

arXiv.org Artificial Intelligence, Dec-8-2022

While human evaluation is the most reliable metric for evaluating speech generation systems, it is generally costly and time-consuming. Previous studies on automatic speech quality assessment address the problem by predicting human evaluation scores with machine learning models. However, they rely on supervised learning and thus suffer from high annotation costs and domain-shift problems. We propose SpeechLMScore, an unsupervised metric to evaluate generated speech using a speech-language model. SpeechLMScore computes the average log-probability of a speech signal by mapping it into discrete tokens and measures the average probability of generating the sequence of tokens. Therefore, it does not require human annotation and is a highly scalable framework. Evaluation results demonstrate that the proposed metric shows a promising correlation with human evaluation scores on different speech generation tasks including voice conversion, text-to-speech, and speech enhancement.
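The scoring idea in this abstract (map speech to discrete tokens, then average the log-probability a language model assigns to the token sequence) can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the token sequences are hand-written integers rather than the output of a real speech tokenizer, and the language model is a count-based bigram model with add-one smoothing, not the speech language model used by the authors.

```python
import math
from collections import Counter


def train_bigram_lm(sequences, vocab_size):
    """Build a toy bigram LM with add-one smoothing (a stand-in for a speech LM)."""
    pair_counts = Counter()
    context_counts = Counter()
    for seq in sequences:
        for prev, cur in zip(seq, seq[1:]):
            pair_counts[(prev, cur)] += 1
            context_counts[prev] += 1

    def log_prob(prev, cur):
        # Smoothed log P(cur | prev); unseen pairs still get non-zero probability.
        return math.log(
            (pair_counts[(prev, cur)] + 1) / (context_counts[prev] + vocab_size)
        )

    return log_prob


def average_log_prob(tokens, log_prob):
    """Mean log-probability of each token given its predecessor."""
    if len(tokens) < 2:
        raise ValueError("need at least two tokens")
    total = sum(log_prob(prev, cur) for prev, cur in zip(tokens, tokens[1:]))
    return total / (len(tokens) - 1)


# Hypothetical discrete token sequences standing in for tokenized speech.
training = [[0, 1, 2, 3], [0, 1, 2, 3], [0, 1, 2, 2]]
lm = train_bigram_lm(training, vocab_size=4)

typical = average_log_prob([0, 1, 2, 3], lm)  # resembles the training data
unusual = average_log_prob([3, 0, 3, 1], lm)  # transitions never seen in training
print(typical > unusual)
```

A sequence that resembles the data the model was trained on receives a higher (less negative) average log-probability than one built from unseen transitions, which is the property such a metric relies on.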

machine learning, natural language, speechlmscore, (19 more...)

arXiv.org Artificial Intelligence

2212.04559

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
North America > United States (0.14)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.74)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.55)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.46)

ConsistTL: Modeling Consistency in Transfer Learning for Low-Resource Neural Machine Translation

Li, Zhaocong, Liu, Xuebo, Wong, Derek F., Chao, Lidia S., Zhang, Min

arXiv.org Artificial Intelligence, Dec-8-2022

Transfer learning is a simple and powerful method that can be used to boost model performance of low-resource neural machine translation (NMT). Existing transfer learning methods for NMT are static, which simply transfer knowledge from a parent model to a child model once via parameter initialization. In this paper, we propose a novel transfer learning method for NMT, namely ConsistTL, which can continuously transfer knowledge from the parent model during the training of the child model. Specifically, for each training instance of the child model, ConsistTL constructs the semantically equivalent instance for the parent model and encourages prediction consistency between the parent and child for this instance, which is equivalent to the child model learning each instance under the guidance of the parent model. Experimental results on five low-resource NMT tasks demonstrate that ConsistTL results in significant improvements over strong transfer learning baselines, with a gain of up to 1.7 BLEU over the existing back-translation model on the widely used WMT17 Turkish-English benchmark. Further analysis reveals that ConsistTL can improve the inference calibration of the child model. Code and scripts are freely available at https://github.com/NLP2CT/ConsistTL.
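The notion of encouraging prediction consistency between a parent and a child model can be sketched as an extra term added to the child's training loss. The abstract does not say which consistency measure is used, so the KL divergence below, the `weight` parameter, and all function names are assumptions made for illustration; the authors' actual implementation is in the repository linked above.

```python
import math


def cross_entropy(probs, target_index):
    """Negative log-probability the model assigns to the reference token."""
    return -math.log(probs[target_index])


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions; q must be non-zero where p is."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def child_loss(child_probs, parent_probs, target_index, weight=1.0):
    """Child training loss: usual cross-entropy plus a consistency penalty.

    The penalty (an assumed choice, not taken from the paper) grows as the
    child's prediction drifts from the parent's prediction on the
    semantically equivalent instance.
    """
    consistency = kl_divergence(parent_probs, child_probs)
    return cross_entropy(child_probs, target_index) + weight * consistency


# Hypothetical next-token distributions over a three-word vocabulary.
parent = [0.7, 0.2, 0.1]
agreeing_child = [0.7, 0.2, 0.1]
drifting_child = [0.4, 0.4, 0.2]

print(child_loss(agreeing_child, parent, target_index=0))
print(child_loss(drifting_child, parent, target_index=0))
```

When the child matches the parent exactly, the penalty vanishes and only the cross-entropy remains; a child that drifts away pays an additional cost at every training step, which is what makes the transfer continuous rather than a one-time initialization.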

computational linguistic, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2212.04262

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Belgium > Brussels-Capital Region > Brussels (0.05)
Europe > Italy > Tuscany > Florence (0.05)
(21 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Learning an artificial language for knowledge-sharing in multilingual translation

AIHub, Dec-7-2022, 12:52:52 GMT

In their recent paper Learning an artificial language for knowledge-sharing in multilingual translation, Danni Liu and Jan Niehues investigate multilingual neural machine translation models. Here, they tell us more about the main contributions of their research. Neural machine translation (NMT) is the backbone of many automatic translation platforms nowadays. The second characteristic is especially useful in low-resource conditions, where training data (translated sentence pairs) are limited. To enable knowledge-sharing between languages, and to improve translation quality on low-resource translation directions, a precondition is the ability to capture common features between languages.

artificial language, multilingual translation, translation, (15 more...)

AIHub

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

A simple introduction to Natural Language Processing (NLP)

#artificialintelligence, Dec-7-2022, 10:40:27 GMT

Natural Language Processing (NLP) is a crucial field within the realm of artificial intelligence. It involves the ability of computers to analyze, understand, and generate human language. This technology has a wide range of applications, from voice assistants to language translation and text analysis. The importance of NLP is evident in our daily lives. We often use voice assistants to set reminders, answer questions, and even make phone calls.

natural language processing, nlp, simple introduction, (4 more...)

#artificialintelligence

Industry: Health & Medicine (0.37)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.60)

General and Domain Adaptive Chinese Spelling Check with Error Consistent Pretraining

Lv, Qi, Cao, Ziqiang, Geng, Lei, Ai, Chunhui, Yan, Xu, Fu, Guohong

arXiv.org Artificial Intelligence, Dec-7-2022

The lack of labeled data is one of the significant bottlenecks for Chinese Spelling Check (CSC). Existing research uses automatic generation, exploiting unlabeled data to expand the supervised corpus. However, there is a big gap between the real input scenario and the automatically generated corpus. Thus, we develop a competitive general speller, ECSpell, which adopts the Error Consistent masking strategy to create data for pretraining. This error consistent masking strategy is used to specify the error types of automatically generated sentences so that they are consistent with real scenarios. The experimental results indicate that our model outperforms previous state-of-the-art models on the general benchmark. Moreover, spellers often work within a particular domain in real life. Experiments on the domain-specific datasets we built show that general models perform poorly because of the many uncommon domain terms. Inspired by the common practice of input methods, we propose to add an alterable user dictionary to handle the zero-shot domain adaptation problem. Specifically, we attach a User Dictionary guided inference module (UD) to a general token classification based speller. Our experiments demonstrate that ECSpell$^{UD}$, namely ECSpell combined with UD, surpasses all other baselines by a large margin, even approaching the performance on the general benchmark.

computational linguistic, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3564271

2203.10929

Country:

Asia > China > Hubei Province > Wuhan (0.04)
Asia > Japan > Honshū > Chūbu > Aichi Prefecture > Nagoya (0.04)
Asia > China > Beijing > Beijing (0.04)
(10 more...)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Improved Self-Supervised Multilingual Speech Representation Learning Combined with Auxiliary Language Information

Ding, Fenglin, Wan, Genshun, Li, Pengcheng, Pan, Jia, Liu, Cong

arXiv.org Artificial Intelligence, Dec-7-2022

Multilingual end-to-end models have shown great improvement over monolingual systems. With the development of pre-training methods on speech, self-supervised multilingual speech representation learning like XLSR has shown success in improving the performance of multilingual automatic speech recognition (ASR). However, similar to supervised learning, multilingual pre-training may also suffer from language interference, which further affects the application of multilingual systems. In this paper, we introduce several techniques for improving self-supervised multilingual pre-training by leveraging auxiliary language information, including language adversarial training, language embedding and language adaptive training during the pre-training stage. We conduct experiments on a multilingual ASR task consisting of 16 languages. Our experimental results demonstrate a 14.3% relative gain over the standard XLSR model, and a 19.8% relative gain over the multilingual model without pre-training.

artificial intelligence, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2212.03476

Country: Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.72)

M3ST: Mix at Three Levels for Speech Translation

Cheng, Xuxin, Dong, Qianqian, Yue, Fengpeng, Ko, Tom, Wang, Mingxuan, Zou, Yuexian

arXiv.org Artificial Intelligence, Dec-7-2022

How can the data scarcity problem for end-to-end speech-to-text translation (ST) be solved? It is well known that data augmentation is an efficient method to improve performance for many tasks by enlarging the dataset. In this paper, we propose the Mix at Three Levels for Speech Translation (M^3ST) method to increase the diversity of the augmented training corpus. Specifically, we conduct two stages of fine-tuning based on a pre-trained model using external machine translation (MT) data. In the first stage of fine-tuning, we mix the training corpus at three levels, including word level, sentence level and frame level, and fine-tune the entire model with mixed data. In the second stage of fine-tuning, we feed both the original speech sequences and the original text sequences into the model in parallel to fine-tune the network, and use Jensen-Shannon divergence to regularize their outputs. Experiments and analysis on the MuST-C speech translation benchmark show that M^3ST outperforms current strong baselines and achieves state-of-the-art results on eight directions with an average BLEU of 29.9.
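The Jensen-Shannon divergence mentioned in this abstract as a regularizer between the speech-input and text-input outputs can be computed as in the following minimal sketch. The two distributions are made-up values over a three-token vocabulary, and how the term is weighted and combined with the translation loss is not stated in the abstract, so none of that is shown here.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions; q must be non-zero where p is."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric, and bounded above by log(2)."""
    # The midpoint m is non-zero wherever p or q is, so both KL terms are finite.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical output distributions for the same sentence given as speech and as text.
speech_out = [0.7, 0.2, 0.1]
text_out = [0.6, 0.3, 0.1]

print(js_divergence(speech_out, text_out))
print(js_divergence(speech_out, speech_out))
```

Unlike plain KL divergence, this measure is symmetric in its two arguments and stays finite even when the distributions do not overlap, which makes it a convenient penalty when neither input modality is treated as the reference.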

artificial intelligence, natural language, translation, (18 more...)

arXiv.org Artificial Intelligence

2212.03657

Country:

North America > Canada > Quebec > Montreal (0.05)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)

Genre: Research Report (0.65)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
