AITopics | Machine Translation

Collaborating Authors

Machine Translation

"Machine translation (MT) is the application of computers to the task of translating texts from one natural language to another. One of the very earliest pursuits in computer science, MT has proved to be an elusive goal, but today a number of systems are available which produce output which, if not perfect, is of sufficient quality to be useful in a number of specific domains."
– Definition from the European Association for Machine Translation (EAMT).

You can translate text of your choice by using free translators such as: CAPITA, Google Translate, SDL International, SYSTRAN.

News Overviews Instructional Materials AI-Alerts Classics

Warmup Generations: A Task-Agnostic Approach for Guiding Sequence-to-Sequence Learning with Unsupervised Initial State Generation

Li, Senyu, Sun, Zipeng, Wang, Jiayi, Liu, Xue, Stenetorp, Pontus, Reddy, Siva, Adelani, David Ifeoluwa

arXiv.org Artificial IntelligenceFeb-17-2025

Traditional supervised fine-tuning (SFT) strategies for sequence-to-sequence tasks often train models to directly generate the target output. Recent work has shown that guiding models with intermediate steps, such as keywords, outlines, or reasoning chains, can significantly improve performance, coherence, and interpretability. However, these methods often depend on predefined intermediate formats and annotated data, limiting their scalability and generalizability. In this work, we introduce a task-agnostic framework that enables models to generate intermediate "warmup" sequences. These warmup sequences, serving as an initial state for subsequent generation, are optimized to enhance the probability of generating the target sequence without relying on external supervision or human-designed structures. Drawing inspiration from reinforcement learning principles, our method iteratively refines these intermediate steps to maximize their contribution to the final output, similar to reward-driven optimization in reinforcement learning with human feedback. Experimental results across tasks such as translation, summarization, and multi-choice question answering for logical reasoning show that our approach outperforms traditional SFT methods, and offers a scalable and flexible solution for sequence-to-sequence tasks.

machine learning, natural language, reinforcement learning, (16 more...)

arXiv.org Artificial Intelligence

2502.12304

Country:

Europe (1.00)
North America > United States > Minnesota (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.54)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.47)

Add feedback

WMT24++: Expanding the Language Coverage of WMT24 to 55 Languages & Dialects

Deutsch, Daniel, Briakou, Eleftheria, Caswell, Isaac, Finkelstein, Mara, Galor, Rebecca, Juraska, Juraj, Kovacs, Geza, Lui, Alison, Rei, Ricardo, Riesa, Jason, Rijhwani, Shruti, Riley, Parker, Salesky, Elizabeth, Trabelsi, Firas, Winkler, Stephanie, Zhang, Biao, Freitag, Markus

arXiv.org Artificial IntelligenceFeb-17-2025

As large language models (LLM) become more and more capable in languages other than English, it is important to collect benchmark datasets in order to evaluate their multilingual performance, including on tasks like machine translation (MT). In this work, we extend the WMT24 dataset to cover 55 languages by collecting new human-written references and post-edits for 46 new languages and dialects in addition to post-edits of the references in 8 out of 9 languages in the original WMT24 dataset. The dataset covers four domains: literary, news, social, and speech. We benchmark a variety of MT providers and LLMs on the collected dataset using automatic metrics and find that LLMs are the best-performing MT systems in all 55 languages. These results should be confirmed using a human-based evaluation, which we leave for future work.

large language model, natural language, translation, (16 more...)

arXiv.org Artificial Intelligence

2502.12404

Country:

Europe (1.00)
Asia (1.00)
North America > United States (0.93)
Africa (0.68)

Genre: Research Report (0.64)

Industry: Leisure & Entertainment > Sports (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Quality-Aware Decoding: Unifying Quality Estimation and Decoding

Koneru, Sai, Huck, Matthias, Exel, Miriam, Niehues, Jan

arXiv.org Artificial IntelligenceFeb-16-2025

Quality Estimation (QE) models for Neural Machine Translation (NMT) predict the quality of the hypothesis without having access to the reference. An emerging research direction in NMT involves the use of QE models, which have demonstrated high correlations with human judgment and can enhance translations through Quality-Aware Decoding. Although several approaches have been proposed based on sampling multiple candidate translations and picking the best candidate, none have integrated these models directly into the decoding process. In this paper, we address this by proposing a novel token-level QE model capable of reliably scoring partial translations. We build a uni-directional QE model for this, as decoder models are inherently trained and efficient on partial sequences. We then present a decoding strategy that integrates the QE model for Quality-Aware decoding and demonstrate that the translation quality improves when compared to the N-best list re-ranking with state-of-the-art QE models (up to $1.39$ XCOMET-XXL $\uparrow$). Finally, we show that our approach provides significant benefits in document translation tasks, where the quality of N-best lists is typically suboptimal. Code can be found at https://ai4lt.iar.kit.edu/english/projects\_kontextmt.php

artificial intelligence, natural language, qe model, (17 more...)

arXiv.org Artificial Intelligence

2502.08561

Country: North America > Mexico (0.28)

Genre: Research Report (0.64)

Industry: Government (0.46)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Add feedback

Are Generative Models Underconfident? An Embarrassingly Simple Quality Estimation Approach

Dinh, Tu Anh, Niehues, Jan

arXiv.org Artificial IntelligenceFeb-16-2025

Quality Estimation (QE) is estimating the quality of model output when the ground truth reference is not available. Looking at model uncertainty from its own output probabilities is the most trivial and low-effort way to estimate the output quality. However, for generative model, output probabilities might not be the best quality estimator. At an output step, there can be multiple correct options, making the probability distribution spread out more. Thus, lower token probability does not necessarily mean lower output quality. In other words, the model can be considered underconfident. In this paper, we propose a QE approach called Dominant Mass Probability (DMP}, that boosts the model confidence in cases where there are multiple viable output options. We show that, with no increase in complexity, DMP is notably better than sequence probability when estimating the quality of different models (Whisper, Llama, etc.) on different tasks (translation, summarization, etc.). Compared to sequence probability, DMP achieves on average +0.208 improvement in Pearson correlation to ground-truth quality.

artificial intelligence, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2502.11115

Country:

Europe (1.00)
Asia > Middle East > UAE (0.14)
North America > Canada (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Generation (0.84)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.35)

Add feedback

Asymmetric Conflict and Synergy in Post-training for LLM-based Multilingual Machine Translation

Zheng, Tong, Wen, Yan, Bao, Huiwen, Guo, Junfeng, Huang, Heng

arXiv.org Artificial IntelligenceFeb-16-2025

The emergence of Large Language Models (LLMs) has advanced the multilingual machine translation (MMT), yet the Curse of Multilinguality (CoM) remains a major challenge. Existing work in LLM-based MMT typically mitigates this issue via scaling up training and computation budget, which raises a critical question: Is scaling up the training and computation budget truly necessary for high-quality MMT, or can a deeper understanding of CoM provide a more efficient solution? To explore this problem, we analyze the linguistic conflicts and synergy, the underlying mechanism of CoM during post-training phase. We identify an asymmetric phenomenon in linguistic conflicts and synergy: the dominance of conflicts and synergy varies in different translation directions, leading to sub-optimal adaptation in existing post-training methods. We further find that a significant bottleneck in MMT appears to lie in post-training rather than multilingual pre-training, suggesting the need for more effective adaptation strategies. Building on these new insights, we propose a direction-aware training approach, combined with group-wise model merging, to address asymmetry in linguistic conflicts and synergy explicitly. Leveraging this strategy, our method fine-tunes X-ALMA-13B-Pretrain-trained only with multilingual pre-training-achieving comparable performance to XALMA-13B (only SFT) while using only 20B pretraining tokens and 17B parameters-5.5x fewer pretraining-tokens and 1.7x fewer model size-with just 0.85 COMET drop on Flores-200 testsets of 50 languages.

large language model, multilingual training, natural language, (12 more...)

arXiv.org Artificial Intelligence

2502.11223

Country:

Asia (0.93)
Europe (0.92)
North America > United States > Minnesota (0.28)

Genre: Research Report > New Finding (0.92)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Multilingual Encoder Knows more than You Realize: Shared Weights Pretraining for Extremely Low-Resource Languages

Su, Zeli, Zhang, Ziyin, Xu, Guixian, Liu, Jianing, Han, XU, Zhang, Ting, Dong, Yushuang

arXiv.org Artificial IntelligenceFeb-15-2025

While multilingual language models like XLM-R have advanced multilingualism in NLP, they still perform poorly in extremely low-resource languages. This situation is exacerbated by the fact that modern LLMs such as LLaMA and Qwen support far fewer languages than XLM-R, making text generation models non-existent for many languages in the world. To tackle this challenge, we propose a novel framework for adapting multilingual encoders to text generation in extremely low-resource languages. By reusing the weights between the encoder and the decoder, our framework allows the model to leverage the learned semantic space of the encoder, enabling efficient learning and effective generalization in low-resource languages. Applying this framework to four Chinese minority languages, we present XLM-SWCM, and demonstrate its superior performance on various downstream tasks even when compared with much larger models.

artificial intelligence, machine translation, natural language, (17 more...)

arXiv.org Artificial Intelligence

2502.10852

Country:

Europe (1.00)
North America (0.93)
Asia > China (0.48)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.94)

Add feedback

Salamandra Technical Report

Gonzalez-Agirre, Aitor, Pàmies, Marc, Llop, Joan, Baucells, Irene, Da Dalt, Severino, Tamayo, Daniel, Saiz, José Javier, Espuña, Ferran, Prats, Jaume, Aula-Blasco, Javier, Mina, Mario, Pikabea, Iñigo, Rubio, Adrián, Shvets, Alexander, Sallés, Anna, Lacunza, Iñaki, Palomar, Jorge, Falcão, Júlia, Tormo, Lucía, Vasquez-Reina, Luis, Marimon, Montserrat, Pareras, Oriol, Ruiz-Fernández, Valle, Villegas, Marta

arXiv.org Artificial IntelligenceFeb-13-2025

This work introduces Salamandra, a suite of open-source decoder-only large language models available in three different sizes: 2, 7, and 40 billion parameters. The models were trained from scratch on highly multilingual data that comprises text in 35 European languages and code. Our carefully curated corpus is made exclusively from open-access data compiled from a wide variety of sources. Along with the base models, supplementary checkpoints that were fine-tuned on public-domain instruction data are also released for chat applications. Additionally, we also share our preliminary experiments on multimodality, which serve as proof-of-concept to showcase potential applications for the Salamandra family. Our extensive evaluations on multilingual benchmarks reveal that Salamandra has strong capabilities, achieving competitive performance when compared to similarly sized open-source models. We provide comprehensive evaluation results both on standard downstream tasks as well as key aspects related to bias and safety.With this technical report, we intend to promote open science by sharing all the details behind our design choices, data curation strategy and evaluation methodology. In addition to that, we deviate from the usual practice by making our training and evaluation scripts publicly accessible. We release all models under a permissive Apache 2.0 license in order to foster future research and facilitate commercial use, thereby contributing to the open-source ecosystem of large language models.

kideak modu eraginkorragoan aurkitzen zituzten, large language model, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2502.08489

Country:

Europe > Spain (1.00)
North America > United States (0.92)
Asia > Middle East > UAE (0.45)

Genre: Research Report > New Finding (0.92)

Industry:

Media (1.00)
Leisure & Entertainment (1.00)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
(4 more...)

Technology:

Information Technology > Software (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
(3 more...)

Add feedback

Theoretical Benefit and Limitation of Diffusion Language Model

Feng, Guhao, Geng, Yihan, Guan, Jian, Wu, Wei, Wang, Liwei, He, Di

arXiv.org Machine LearningFeb-13-2025

Diffusion language models have emerged as a promising approach for text generation. One would naturally expect this method to be an efficient replacement for autoregressive models since multiple tokens can be sampled in parallel during each diffusion step. However, its efficiency-accuracy trade-off is not yet well understood. In this paper, we present a rigorous theoretical analysis of a widely used type of diffusion language model, the Masked Diffusion Model (MDM), and find that its effectiveness heavily depends on the target evaluation metric. Under mild conditions, we prove that when using perplexity as the metric, MDMs can achieve near-optimal perplexity in sampling steps regardless of sequence length, demonstrating that efficiency can be achieved without sacrificing performance. However, when using the sequence error rate--which is important for understanding the "correctness" of a sequence, such as a reasoning chain--we show that the required sampling steps must scale linearly with sequence length to obtain "correct" sequences, thereby eliminating MDM's efficiency advantage over autoregressive models. Our analysis establishes the first theoretical foundation for understanding the benefits and limitations of MDMs. All theoretical findings are supported by empirical studies.

large language model, machine learning, natural language, (16 more...)

arXiv.org Machine Learning

2502.09622

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Are Expressions for Music Emotions the Same Across Cultures?

Celen, Elif, van Rijn, Pol, Lee, Harin, Jacoby, Nori

arXiv.org Artificial IntelligenceFeb-12-2025

Music evokes profound emotions, yet the universality of emotional descriptors across languages remains debated. A key challenge in cross-cultural research on music emotion is biased stimulus selection and manual curation of taxonomies, predominantly relying on Western music and languages. To address this, we propose a balanced experimental design with nine online experiments in Brazil, the US, and South Korea, involving N=672 participants. First, we sample a balanced set of popular music from these countries. Using an open-ended tagging pipeline, we then gather emotion terms to create culture-specific taxonomies. Finally, using these bottom-up taxonomies, participants rate emotions of each song. This allows us to map emotional similarities within and across cultures. Results show consistency in high arousal, high valence emotions but greater variability in others. Notably, machine translations were often inadequate to capture music-specific meanings. These findings together highlight the need for a domain-sensitive, open-ended, bottom-up emotion elicitation approach to reduce cultural biases in emotion research.

artificial intelligence, machine translation, natural language, (18 more...)

arXiv.org Artificial Intelligence

2502.08744

Country:

South America > Brazil (0.28)
Asia > South Korea (0.24)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
(7 more...)

Genre: Research Report > New Finding (0.88)

Industry:

Media > Music (1.00)
Leisure & Entertainment (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.34)

Add feedback

Top translation apps for travelers

FOX NewsFeb-11-2025, 11:00:53 GMT

Navi allows you to understand anyone, anywhere, anytime. Traveling abroad can be an incredible adventure, but the language barrier can make even the most intrepid explorer nervous. The fear of being unable to communicate or understand signs and menus keeps many would-be travelers from ever leaving their home country. With the rise of powerful translation apps, you can harness the latest technology right on your smartphone to bridge the language gap wherever your journey takes you. I've tested out the top contenders to bring you my picks for the best translation apps for global travelers.

artificial intelligence, natural language, translation app, (18 more...)

FOX News

Technology:

Information Technology > Communications > Mobile (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.61)

Add feedback