AITopics | Machine Translation

Machine Translation

"Machine translation (MT) is the application of computers to the task of translating texts from one natural language to another. One of the very earliest pursuits in computer science, MT has proved to be an elusive goal, but today a number of systems are available which produce output which, if not perfect, is of sufficient quality to be useful in a number of specific domains."
– Definition from the European Association for Machine Translation (EAMT).

You can translate text of your choice with free translators such as CAPITA, Google Translate, SDL International, and SYSTRAN.

Speech-to-Speech Translation with Discrete-Unit-Based Style Transfer

Wang, Yongqi, Bai, Jionghao, Huang, Rongjie, Li, Ruiqi, Hong, Zhiqing, Zhao, Zhou

arXiv.org Artificial Intelligence, Sep-14-2023

Direct speech-to-speech translation (S2ST) with discrete self-supervised representations has achieved remarkable accuracy, but is unable to preserve the speaker timbre of the source speech during translation. Meanwhile, the scarcity of high-quality speaker-parallel data poses a challenge for learning style transfer between source and target speech. We propose an S2ST framework with an acoustic language model based on discrete units from a self-supervised model and a neural codec for style transfer. The acoustic language model leverages self-supervised in-context learning, acquiring the ability for style transfer without relying on any speaker-parallel data, thereby overcoming the issue of data scarcity. By using extensive training data, our model achieves zero-shot cross-lingual style transfer on previously unseen source languages. Experiments show that our model generates translated speech with high fidelity and style similarity. Audio samples are available at http://stylelm.github.io/.

representation, speech, translation, (12 more...)

2309.07566

Country:

Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
Asia > China (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.87)

Direct Text to Speech Translation System using Acoustic Units

Mingote, Victoria, Gimeno, Pablo, Vicente, Luis, Khurana, Sameer, Laurent, Antoine, Duret, Jarod

arXiv.org Artificial Intelligence, Sep-14-2023

This paper proposes a direct text-to-speech translation system using discrete acoustic units. The framework takes text in different source languages as input and generates speech in the target language without requiring text transcriptions in that language. Motivated by the success of acoustic units in previous work on direct speech-to-speech translation, we use the same pipeline to extract the acoustic units, combining a speech encoder with a clustering algorithm. Once the units are obtained, an encoder-decoder architecture is trained to predict them, and a vocoder then generates speech from the units. Our approach was tested on the new CVSS corpus with two different text mBART models used for initialisation. The systems presented report competitive performance for most of the language pairs evaluated. In addition, results show a remarkable improvement when our proposed architecture is initialised with a model pre-trained on more languages.
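
As a rough illustration of the unit-extraction step this abstract describes (speech encoder features followed by clustering), the sketch below maps each feature frame to the index of its nearest cluster centroid. The function name `quantize_frames`, the toy 2-D features, and the centroids are all hypothetical; the abstract does not specify the encoder or the clustering algorithm, so this is only a minimal nearest-centroid assignment, not the authors' pipeline.

```python
def quantize_frames(frames, centroids):
    """Map each feature frame to the index of its nearest centroid.

    Assumption: frames and centroids are equal-length lists of floats,
    and squared Euclidean distance is the similarity measure.
    """
    units = []
    for frame in frames:
        # Squared Euclidean distance from this frame to every centroid.
        distances = [
            sum((x - c) ** 2 for x, c in zip(frame, centroid))
            for centroid in centroids
        ]
        # The discrete unit is the index of the closest centroid.
        units.append(distances.index(min(distances)))
    return units


# Toy example: two centroids, three frames (values are made up).
toy_centroids = [[0.0, 0.0], [1.0, 1.0]]
toy_frames = [[0.1, 0.0], [0.9, 1.2], [0.2, 0.1]]
toy_units = quantize_frames(toy_frames, toy_centroids)
```

In a real system the centroids would come from a clustering run over encoder outputs, and the resulting unit sequence would be the prediction target for the encoder-decoder model.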

acoustic unit, speech, translation, (14 more...)

doi: 10.1109/LSP.2023.3313513

2309.07478

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
Europe > France (0.04)
Europe > Spain > Aragón > Zaragoza Province > Zaragoza (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.34)

Glancing Future for Simultaneous Machine Translation

Guo, Shoutao, Zhang, Shaolei, Feng, Yang

arXiv.org Artificial Intelligence, Sep-12-2023

Simultaneous machine translation (SiMT) outputs translation while reading the source sentence. Unlike conventional sequence-to-sequence (seq2seq) training, existing SiMT methods adopt prefix-to-prefix (prefix2prefix) training, where the model predicts target tokens based on partial source tokens. However, prefix2prefix training diminishes the model's ability to capture global information and introduces forced predictions due to the absence of essential source information. Consequently, it is crucial to bridge the gap between prefix2prefix training and seq2seq training to enhance the translation capability of the SiMT model. In this paper, we propose a novel method that glances at future source information through curriculum learning to achieve the transition from seq2seq training to prefix2prefix training. Specifically, we gradually reduce the available source information from the whole sentence to the prefix corresponding to the given latency. Our method is applicable to a wide range of SiMT methods, and experiments demonstrate that it outperforms strong baselines.
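
The curriculum idea in this abstract, shrinking the visible source from the whole sentence down to a latency-dependent prefix, can be sketched as a simple schedule. The function name `visible_source_length`, its parameters, and the linear decay are assumptions made for illustration; the abstract does not state which schedule the authors use.

```python
def visible_source_length(step, total_steps, source_len, prefix_len):
    """Number of source tokens the model may see at a training step.

    Assumption: a linear decay from the full sentence (step 0) to the
    latency-dependent prefix (last step). The real schedule may differ.
    """
    # A prefix can never be longer than the sentence itself.
    prefix_len = min(prefix_len, source_len)
    if total_steps <= 1:
        return prefix_len
    # Training progress clamped to [0, 1].
    progress = min(max(step / (total_steps - 1), 0.0), 1.0)
    length = source_len - progress * (source_len - prefix_len)
    return max(prefix_len, round(length))


# Toy schedule: 11 steps, a 20-token sentence, a 4-token final prefix.
schedule = [visible_source_length(s, 11, 20, 4) for s in range(11)]
```

Early steps behave like seq2seq training (the full source is visible) and the final step matches prefix2prefix training, which is the transition the abstract describes.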

artificial intelligence, natural language, translation, (17 more...)

2309.06179

Country: Asia > China (0.14)

Genre: Research Report (0.70)

Industry:

Materials > Chemicals > Industrial Gases > Liquified Gas (0.47)
Materials > Chemicals > Commodity Chemicals > Petrochemicals > LNG (0.47)
Energy > Oil & Gas > Midstream (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Benchmarking Procedural Language Understanding for Low-Resource Languages: A Case Study on Turkish

Uzunoğlu, Arda, Şahin, Gözde Gül

arXiv.org Artificial Intelligence, Sep-12-2023

Understanding procedural natural language (e.g., step-by-step instructions) is a crucial step towards execution and planning. However, while there are ample corpora and downstream tasks available in English, the field lacks such resources for most languages. To address this gap, we conduct a case study on Turkish procedural texts. We first expand the number of tutorials in Turkish wikiHow from 2,000 to 52,000 using automated translation tools, where the translation quality and fidelity to the original meaning are validated by a team of experts on a random set. Then, we generate several downstream tasks on the corpus, such as linking actions, goal inference, and summarization. To tackle these tasks, we implement strong baseline models by fine-tuning large language-specific models such as TR-BART and BERTurk, as well as multilingual models such as mBART, mT5, and XLM. We find that language-specific models consistently outperform their multilingual counterparts by a significant margin across most procedural language understanding (PLU) tasks. We release our corpus, downstream tasks and the baseline models at https://github.com/GGLAB-KU/turkish-plu.

case study, low-resource language

2309.06698

Genre:

Workflow (0.53)
Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.87)

Grounded Language Acquisition From Object and Action Imagery

Kubricht, James Robert, Yang, Zhaoyuan, Qiu, Jianwei, Tu, Peter Henry

arXiv.org Artificial Intelligence, Sep-12-2023

Deep learning approaches to natural language processing have made great strides in recent years. While these models produce symbols that convey vast amounts of diverse knowledge, it is unclear how such symbols are grounded in data from the world. In this paper, we explore the development of a private language for visual data representation by training emergent language (EL) encoders/decoders in both i) a traditional referential game environment and ii) a contrastive learning environment utilizing a within-class matching training paradigm. An additional classification layer utilizing neural machine translation and random forest classification was used to transform symbolic representations (sequences of integer symbols) to class labels. These methods were applied in two experiments focusing on object recognition and action recognition. For object recognition, a set of sketches produced by human participants from real imagery was used (Sketchy dataset) and for action recognition, 2D trajectories were generated from 3D motion capture systems (MOVI dataset). In order to interpret the symbols produced for data in each experiment, gradient-weighted class activation mapping (Grad-CAM) methods were used to identify pixel regions indicating semantic features which contribute evidence towards symbols in learned languages. Additionally, a t-distributed stochastic neighbor embedding (t-SNE) method was used to investigate embeddings learned by CNN feature extractors.

category, dataset, representation, (13 more...)

2309.06335

Country:

North America > United States (0.14)
South America > Peru > Lima Department > Lima Province > Lima (0.04)

Genre: Research Report (0.82)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.90)

A Survey of Hallucination in Large Foundation Models

Rawte, Vipula, Sheth, Amit, Das, Amitava

arXiv.org Artificial Intelligence, Sep-11-2023

Hallucination in a foundation model (FM) refers to the generation of content that strays from factual reality or includes fabricated information. This survey paper provides an extensive overview of recent efforts that aim to identify, elucidate, and tackle the problem of hallucination, with a particular focus on "Large" Foundation Models (LFMs). The paper classifies various types of hallucination phenomena that are specific to LFMs and establishes evaluation criteria for assessing the extent of hallucination. It also examines existing strategies for mitigating hallucination in LFMs and discusses potential directions for future research in this area. Essentially, the paper offers a comprehensive examination of the challenges and solutions related to hallucination in LFMs.

arxiv preprint arxiv, hallucination, language model, (12 more...)

2309.05922

Country:

North America > United States > South Carolina (0.04)
Europe > Switzerland > Zürich > Zürich (0.04)
Europe > Middle East > Cyprus > Nicosia > Nicosia (0.04)

Genre: Overview (1.00)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
(2 more...)

NeCo@ALQAC 2023: Legal Domain Knowledge Acquisition for Low-Resource Languages through Data Enrichment

Nguyen, Hai-Long, Nguyen, Dieu-Quynh, Nguyen, Hoang-Trung, Pham, Thu-Trang, Nguyen, Huu-Dong, Nguyen, Thach-Anh, Vuong, Thi-Hai-Yen, Nguyen, Ha-Thanh

arXiv.org Artificial Intelligence, Sep-11-2023

In recent years, natural language processing has gained significant popularity in various sectors, including the legal domain. This paper presents NeCo Team's solutions to the Vietnamese text processing tasks provided in the Automated Legal Question Answering Competition 2023 (ALQAC 2023), focusing on legal domain knowledge acquisition for low-resource languages through data enrichment. Our methods for the legal document retrieval task employ a combination of similarity ranking and deep learning models, while for the second task, which requires extracting an answer from a relevant legal article in response to a question, we propose a range of adaptive techniques to handle different question types. Our approaches achieve outstanding results on both tasks of the competition, demonstrating the potential benefits and effectiveness of question answering systems in the legal field, particularly for low-resource languages.

competition, dataset, nguyen, (15 more...)

2309.055

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Europe > Switzerland (0.04)
Asia > Vietnam > Hanoi > Hanoi (0.04)

Genre: Research Report (0.64)

Industry: Law (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.78)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.68)

Machine Translation Models Stand Strong in the Face of Adversarial Attacks

Burnyshev, Pavel, Kostenok, Elizaveta, Zaytsev, Alexey

arXiv.org Artificial Intelligence, Sep-10-2023

Adversarial attacks expose vulnerabilities of deep learning models by introducing minor perturbations to the input, which lead to substantial alterations in the output. Our research focuses on the impact of such adversarial attacks on sequence-to-sequence (seq2seq) models, specifically machine translation models. We introduce algorithms that incorporate basic text perturbation heuristics and more advanced strategies, such as the gradient-based attack, which utilizes a differentiable approximation of the inherently non-differentiable translation metric. Through our investigation, we provide evidence that machine translation models display robustness against the best-performing known adversarial attacks, as the degree of perturbation in the output is directly proportional to the perturbation in the input. However, among underdogs, our attacks outperform alternatives, providing the best relative performance. Another strong candidate is an attack based on mixing of individual characters.

adversarial attack, machine translation model stand

2309.06527

Genre: Research Report (0.40)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.80)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.53)

The Effect of Alignment Objectives on Code-Switching Translation

Anwar, Mohamed

arXiv.org Artificial Intelligence, Sep-10-2023

One of the things that needs to change in machine translation is the models' ability to translate code-switching content, especially with the rise of social media and user-generated content. In this paper, we propose a way of training a single machine translation model that can translate monolingual sentences from one language to another, as well as code-switched sentences to either language. This model can be considered a bilingual model in the human sense. For better use of parallel data, we generated synthetic code-switched (CSW) data along with an alignment loss on the encoder to align representations across languages. Using the WMT14 English-French (En-Fr) dataset, the trained model strongly outperforms bidirectional baselines on code-switched translation while maintaining quality for non-code-switched (monolingual) data.

arxiv preprint arxiv, code-switched data, translation, (14 more...)

2309.05044

Country:

Africa (0.05)
North America > United States > Tennessee (0.04)
North America > Canada > Ontario (0.04)
(2 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

MADLAD-400: A Multilingual And Document-Level Large Audited Dataset

Kudugunta, Sneha, Caswell, Isaac, Zhang, Biao, Garcia, Xavier, Choquette-Choo, Christopher A., Lee, Katherine, Xin, Derrick, Kusupati, Aditya, Stella, Romi, Bapna, Ankur, Firat, Orhan

arXiv.org Artificial Intelligence, Sep-8-2023

We introduce MADLAD-400, a manually audited, general domain 3T token monolingual dataset based on CommonCrawl, spanning 419 languages. We discuss the limitations revealed by self-auditing MADLAD-400, and the role data auditing had in the dataset creation process. We then train and release a 10.7B-parameter multilingual machine translation model on 250 billion tokens covering over 450 languages using publicly available data, and find that it is competitive with models that are significantly larger, and report the results on different domains. In addition, we train an 8B-parameter language model, and assess the results on few-shot translation. We make the baseline models available to the research community.

dataset, latn, madlad-400, (15 more...)

2309.04662

Country:

Oceania > Tonga (0.04)
North America > United States (0.04)
Asia > Indonesia > Bali (0.04)
(30 more...)

Genre: Research Report (0.63)

Industry: Law (0.67)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Data Science > Data Quality (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
