AITopics | Machine Translation

Machine Translation

"Machine translation (MT) is the application of computers to the task of translating texts from one natural language to another. One of the very earliest pursuits in computer science, MT has proved to be an elusive goal, but today a number of systems are available which produce output which, if not perfect, is of sufficient quality to be useful in a number of specific domains."
– Definition from the European Association for Machine Translation (EAMT).

You can translate text of your choice by using free translators such as: CAPITA, Google Translate, SDL International, SYSTRAN.

Low-Resource Neural Machine Translation Using Recurrent Neural Networks and Transfer Learning: A Case Study on English-to-Igbo

Ekle, Ocheme Anthony, Das, Biswarup

arXiv.org Artificial Intelligence, Apr-25-2025

In this study, we develop Neural Machine Translation (NMT) and Transformer-based transfer learning models for translation from English into Igbo, a low-resource African language spoken by over 40 million people across Nigeria and West Africa. Our models are trained on a curated and benchmarked dataset compiled from Bible corpora, local news, Wikipedia articles, and Common Crawl, all verified by native language experts. We leverage Recurrent Neural Network (RNN) architectures, including Long Short-Term Memory (LSTM) and Gated Recurrent Units (GRU), enhanced with attention mechanisms to improve translation accuracy. To further enhance performance, we apply transfer learning using MarianNMT pre-trained models within the SimpleTransformers framework. Our RNN-based system achieves competitive results, closely matching existing English-Igbo benchmarks. With transfer learning, we observe a performance gain of +4.83 BLEU points, reaching an estimated translation accuracy of 70%. These findings highlight the effectiveness of combining RNNs with transfer learning to address the performance gap in low-resource language translation tasks.

machine learning, natural language, translation, (12 more...)

arXiv.org Artificial Intelligence

2504.17252

Country:

Africa > Nigeria (0.88)
North America > United States (0.68)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)

Industry:

Law (0.67)
Government > Regional Government > Africa Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Comparing Large Language Models and Traditional Machine Translation Tools for Translating Medical Consultation Summaries: A Pilot Study

Li, Andy, Zhou, Wei, Hoda, Rashina, Bain, Chris, Poon, Peter

arXiv.org Artificial Intelligence, Apr-24-2025

This study evaluates how well large language models (LLMs) and traditional machine translation (MT) tools translate medical consultation summaries from English into Arabic, Chinese, and Vietnamese. It assesses both patient-friendly and clinician-focused texts using standard automated metrics. Results showed that traditional MT tools generally performed better, especially for complex texts, while LLMs showed promise, particularly in Vietnamese and Chinese, when translating simpler summaries. Arabic translations improved with complexity due to the language's morphology. Overall, while LLMs offer contextual flexibility, they remain inconsistent, and current evaluation metrics fail to capture clinical relevance. The study highlights the need for domain-specific training, improved evaluation methods, and human oversight in medical translation.

large language model, machine learning, translation, (18 more...)

arXiv.org Artificial Intelligence

2504.16601

Country:

Europe (0.46)
Oceania > Australia (0.30)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)

Industry: Health & Medicine > Therapeutic Area > Hematology (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Using Phonemes in cascaded S2S translation pipeline

Pilz, Rene, Schneider, Johannes

arXiv.org Artificial Intelligence, Apr-24-2025

This paper explores the idea of using phonemes as a textual representation within a conventional multilingual simultaneous speech-to-speech translation pipeline, as opposed to the traditional reliance on text-based language representations. To investigate this, we trained an open-source sequence-to-sequence model on the WMT17 dataset in two formats: one using standard textual representation and the other employing phonemic representation. The performance of both approaches was assessed using the BLEU metric. Our findings show that the phonemic approach provides comparable quality but offers several advantages, including lower resource requirements or better suitability for low-resource languages.
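
For readers unfamiliar with the BLEU metric mentioned in the abstract above, the following is a minimal sketch of how such a score is formed: a geometric mean of clipped n-gram precisions multiplied by a brevity penalty. It is a simplified, single-reference, sentence-level variant with whitespace tokenization and no smoothing. The function names `ngrams` and `simple_bleu` are illustrative assumptions; this is not the evaluation code used in the paper, and published scores typically come from standardized corpus-level tools.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def simple_bleu(hypothesis, reference, max_n=4):
    """Simplified single-reference, sentence-level BLEU in [0, 1].

    Geometric mean of clipped n-gram precisions times a brevity penalty.
    No smoothing: any zero precision gives a score of 0.
    """
    hyp, ref = hypothesis.split(), reference.split()
    if not hyp:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        hyp_counts, ref_counts = ngrams(hyp, n), ngrams(ref, n)
        total = sum(hyp_counts.values())
        # Clip each hypothesis n-gram count by its count in the reference.
        matched = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        if total == 0 or matched == 0:
            return 0.0
        log_precisions.append(math.log(matched / total))
    # Brevity penalty discourages hypotheses shorter than the reference.
    bp = 1.0 if len(hyp) >= len(ref) else math.exp(1 - len(ref) / len(hyp))
    return bp * math.exp(sum(log_precisions) / max_n)
```

In this sketch a hypothesis identical to its reference scores 1.0, and a hypothesis sharing no words with the reference scores 0.0. Reported BLEU values are usually this quantity scaled to 0-100 and computed over a whole test set.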

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2504.16234

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Speech (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.94)

The Bitter Lesson Learned from 2,000+ Multilingual Benchmarks

Wu, Minghao, Wang, Weixuan, Liu, Sinuo, Yin, Huifeng, Wang, Xintong, Zhao, Yu, Lyu, Chenyang, Wang, Longyue, Luo, Weihua, Zhang, Kaifu

arXiv.org Artificial Intelligence, Apr-23-2025

As large language models (LLMs) continue to advance in linguistic capabilities, robust multilingual evaluation has become essential for promoting equitable technological progress. This position paper examines over 2,000 multilingual (non-English) benchmarks from 148 countries, published between 2021 and 2024, to evaluate past, present, and future practices in multilingual benchmarking. Our findings reveal that, despite significant investments amounting to tens of millions of dollars, English remains significantly overrepresented in these benchmarks. Additionally, most benchmarks rely on original language content rather than translations, with the majority sourced from high-resource countries such as China, India, Germany, the UK, and the USA. Furthermore, a comparison of benchmark performance with human judgments highlights notable disparities. STEM-related tasks exhibit strong correlations with human evaluations (0.70 to 0.85), while traditional NLP tasks like question answering (e.g., XQuAD) show much weaker correlations (0.11 to 0.30). Moreover, translating English benchmarks into other languages proves insufficient, as localized benchmarks demonstrate significantly higher alignment with local human judgments (0.68) than their translated counterparts (0.47). This underscores the importance of creating culturally and linguistically tailored benchmarks rather than relying solely on translations. Through this comprehensive analysis, we highlight six key limitations in current multilingual evaluation practices, propose the guiding principles accordingly for effective multilingual benchmarking, and outline five critical research directions to drive progress in the field. Finally, we call for a global collaborative effort to develop human-aligned benchmarks that prioritize real-world applications.

benchmark, large language model, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2504.15521

Country:

Europe (1.00)
Asia (1.00)
North America > United States (0.49)

Genre: Research Report > New Finding (1.00)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

SimulS2S-LLM: Unlocking Simultaneous Inference of Speech LLMs for Speech-to-Speech Translation

Deng, Keqi, Chen, Wenxi, Chen, Xie, Woodland, Philip C.

arXiv.org Artificial Intelligence, Apr-23-2025

Simultaneous speech translation (SST) outputs translations in parallel with streaming speech input, balancing translation quality and latency. While large language models (LLMs) have been extended to handle the speech modality, streaming remains challenging as speech is prepended as a prompt for the entire generation process. To unlock LLM streaming capability, this paper proposes SimulS2S-LLM, which trains speech LLMs offline and employs a test-time policy to guide simultaneous inference. SimulS2S-LLM alleviates the mismatch between training and inference by extracting boundary-aware speech prompts that allow it to be better matched with text input data. SimulS2S-LLM achieves simultaneous speech-to-speech translation (Simul-S2ST) by predicting discrete output speech tokens and then synthesising output speech using a pre-trained vocoder. An incremental beam search is designed to expand the search space of speech token prediction without increasing latency. Experiments on the CVSS speech data show that SimulS2S-LLM offers a better translation quality-latency trade-off than existing methods that use the same training data, such as improving ASR-BLEU scores by 3 points at similar latency.

large language model, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2504.15509

Country:

Europe (1.00)
Asia (1.00)
North America > United States > Minnesota (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Regional Tiny Stories: Using Small Models to Compare Language Learning and Tokenizer Performance

Patil, Nirvan, Inamdar, Malhar Abhay, Gosai, Agnivo, Pathak, Guruprasad, Joshi, Anish, Sagavekar, Aryan, Joshirao, Anish, Dandekar, Raj, Dandekar, Rajat, Panat, Sreedath

arXiv.org Artificial Intelligence, Apr-23-2025

The 2023 TinyStories study developed an English dataset that allows Small Language Models (SLMs) with 1-10 million parameters to produce coherent outputs matching those of LLMs. Our research expands this framework by creating translated as well as synthetically generated datasets in Indian languages. Using this new dataset, we demonstrate that SLMs efficiently process regional languages with significantly fewer parameters than LLMs, and additionally offer a complementary framework for "inference-based evaluation" of tokenization strategies and linguistic complexity. Our analysis reveals that language-specific tokenizers outperform general-purpose ones for Indian languages. Empirical validations, supported by information-theoretic and morphological analyses, provide insights into the superior performance of Hindi models over Marathi and Bengali. The study uncovers distinct cross-linguistic patterns: Bengali emphasizes creativity, Hindi excels in context understanding and grammar with model scaling, and Marathi requires larger models to capture its unique linguistic features. Optimal parameter allocation varies, with Hindi benefiting more from wider architectures and Bengali favoring a balanced approach. We also show that quality synthetic datasets outperform translated content for training SLMs by 15-30%. These findings advance both the practical application of SLMs to underserved languages and our theoretical understanding of neural language development.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2504.07989

Country:

Europe (1.00)
North America > United States (0.67)

Genre: Research Report > New Finding (0.93)

Industry: Education > Curriculum > Subject-Specific Education (0.42)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.95)

Automatic Evaluation Metrics for Document-level Translation: Overview, Challenges and Trends

GUO, Jiaxin, Chen, Xiaoyu, Rao, Zhiqiang, Yang, Jinlong, Li, Zongyao, Shang, Hengchao, Wei, Daimeng, Yang, Hao

arXiv.org Artificial Intelligence, Apr-22-2025

With the rapid development of deep learning technologies, the field of machine translation has witnessed significant progress, especially with the advent of large language models (LLMs) that have greatly propelled the advancement of document-level translation. However, accurately evaluating the quality of document-level translation remains an urgent issue. This paper first introduces the development status of document-level translation and the importance of evaluation, highlighting the crucial role of automatic evaluation metrics in reflecting translation quality and guiding the improvement of translation systems. It then provides a detailed analysis of the current state of automatic evaluation schemes and metrics, including evaluation methods with and without reference texts, as well as traditional metrics, model-based metrics, and LLM-based metrics. Subsequently, the paper explores the challenges faced by current evaluation methods, such as the lack of reference diversity, dependence on sentence-level alignment information, and the bias, inaccuracy, and lack of interpretability of the LLM-as-a-judge method. Finally, the paper looks ahead to the future trends in evaluation methods, including the development of more user-friendly document-level evaluation methods and more robust LLM-as-a-judge methods, and proposes possible research directions, such as reducing the dependency on sentence-level information, introducing multi-level and multi-granular evaluation approaches, and training models specifically for machine translation evaluation. This study aims to provide a comprehensive analysis of automatic evaluation for document-level translation and offer insights into future developments.

large language model, machine learning, translation, (18 more...)

arXiv.org Artificial Intelligence

2504.14804

Country:

Asia > China (0.29)
North America > United States > California > Los Angeles County (0.14)

Genre: Research Report (0.65)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

PEFT A2Z: Parameter-Efficient Fine-Tuning Survey for Large Language and Vision Models

Prottasha, Nusrat Jahan, Chowdhury, Upama Roy, Mohanto, Shetu, Nuzhat, Tasfia, Sami, Abdullah As, Ali, Md Shamol, Sobuj, Md Shohanur Islam, Raman, Hafijur, Kowsher, Md, Garibay, Ozlem Ozmen

arXiv.org Artificial Intelligence, Apr-22-2025

Large models such as Large Language Models (LLMs) and Vision Language Models (VLMs) have transformed artificial intelligence, powering applications in natural language processing, computer vision, and multimodal learning. However, fully fine-tuning these models remains expensive, requiring extensive computational resources, memory, and task-specific data. Parameter-Efficient Fine-Tuning (PEFT) has emerged as a promising solution that allows adapting large models to downstream tasks by updating only a small portion of parameters. This survey presents a comprehensive overview of PEFT techniques, focusing on their motivations, design principles, and effectiveness. We begin by analyzing the resource and accessibility challenges posed by traditional fine-tuning and highlight key issues, such as overfitting, catastrophic forgetting, and parameter inefficiency. We then introduce a structured taxonomy of PEFT methods -- grouped into additive, selective, reparameterized, hybrid, and unified frameworks -- and systematically compare their mechanisms and trade-offs. Beyond taxonomy, we explore the impact of PEFT across diverse domains, including language, vision, and generative modeling, showing how these techniques offer strong performance with lower resource costs. We also discuss important open challenges in scalability, interpretability, and robustness, and suggest future directions such as federated learning, domain adaptation, and theoretical grounding. Our goal is to provide a unified understanding of PEFT and its growing role in enabling practical, efficient, and sustainable use of large models.
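
To make "updating only a small portion of parameters" concrete, here is a small arithmetic sketch for one common reparameterized approach, a low-rank update in which a frozen weight matrix of shape d_out x d_in is adapted by training two thin matrices B (d_out x r) and A (r x d_in). The abstract does not name this specific method or these sizes; the function name `lowrank_trainable_fraction` and the example dimensions are assumptions chosen for illustration only.

```python
def lowrank_trainable_fraction(d_in, d_out, rank):
    """Ratio of trainable low-rank parameters to the full matrix size.

    Full fine-tuning of one weight matrix trains d_out * d_in values.
    A rank-`rank` update B @ A trains rank * (d_in + d_out) values instead.
    """
    full = d_in * d_out
    lowrank = rank * (d_in + d_out)
    return lowrank / full


# Hypothetical layer size and rank, chosen only for illustration.
fraction = lowrank_trainable_fraction(d_in=4096, d_out=4096, rank=8)
print(f"{fraction:.4%} of the full matrix's parameter count is trained")
```

For the hypothetical 4096 x 4096 layer at rank 8, the ratio is 1/256, which is the kind of saving the abstract attributes to PEFT in general. Actual savings depend on the method, the rank, and which layers are adapted.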

large language model, machine learning, pattern recognition, (25 more...)

arXiv.org Artificial Intelligence

2504.14117

Country:

Europe (1.00)
Asia (1.00)
North America > Canada (0.67)
North America > United States > Minnesota (0.28)

Genre:

Overview (1.00)
Research Report > Promising Solution (0.65)

Industry:

Law (1.00)
Health & Medicine > Therapeutic Area (1.00)
Energy (1.00)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
(7 more...)

Amplify Initiative: Building A Localized Data Platform for Globalized AI

Rashid, Qazi Mamunur, van Liemt, Erin, Shih, Tiffany, Ebinama, Amber, Ramos, Karla Barrios, Maji, Madhurima, Verma, Aishwarya, Kalia, Charu, Smith-Loud, Jamila, Nakatumba-Nabende, Joyce, Baguma, Rehema, Katumba, Andrew, Mutebi, Chodrine, Marvin, Jagen, Wairagala, Eric Peter, Bruce, Mugizi, Oketta, Peter, Nderu, Lawrence, Obiajunwa, Obichi, Oppong, Abigail, Zimba, Michael, Authors, Data

arXiv.org Artificial Intelligence, Apr-22-2025

Current AI models often fail to account for local context and language, given the predominance of English and Western internet content in their training data. This hinders the global relevance, usefulness, and safety of these models as they gain more users around the globe. Amplify Initiative, a data platform and methodology, leverages expert communities to collect diverse, high-quality data to address the limitations of these models. The platform is designed to enable co-creation of datasets, provide access to high-quality multilingual datasets, and offer recognition to data authors. This paper presents the approach to co-creating datasets with domain experts (e.g., health workers, teachers) through a pilot conducted in Sub-Saharan Africa (Ghana, Kenya, Malawi, Nigeria, and Uganda). In partnership with local researchers situated in these countries, the pilot demonstrated an end-to-end approach to co-creating data with 155 experts in sensitive domains (e.g., physicians, bankers, anthropologists, human and civil rights advocates). This approach, implemented with an Android app, resulted in an annotated dataset of 8,091 adversarial queries in seven languages (e.g., Luganda, Swahili, Chichewa), capturing nuanced and contextual information related to key themes such as misinformation and public interest topics. This dataset in turn can be used to evaluate models for their safety and cultural relevance within the context of these languages.

data mining, large language model, machine learning, (24 more...)

arXiv.org Artificial Intelligence

2504.14105

Country:

Europe > Spain (0.28)
Africa > Uganda (0.27)
Africa > Malawi (0.26)
(4 more...)

Genre:

Instructional Material (0.93)
Research Report (0.64)

Industry:

Law > Civil Rights & Constitutional Law (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
(7 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.71)
Information Technology > Data Science > Data Mining > Big Data (0.70)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.47)
(3 more...)

ReMedy: Learning Machine Translation Evaluation from Human Preferences with Reward Modeling

Tan, Shaomu, Monz, Christof

arXiv.org Artificial Intelligence, Apr-21-2025

A key challenge in MT evaluation is the inherent noise and inconsistency of human ratings. Regression-based neural metrics struggle with this noise, while prompting LLMs shows promise at system-level evaluation but performs poorly at segment level. In this work, we propose ReMedy, a novel MT metric framework that reformulates translation evaluation as a reward modeling task. Instead of regressing on imperfect human ratings directly, ReMedy learns relative translation quality using pairwise preference data, resulting in a more reliable evaluation. In extensive experiments across WMT22-24 shared tasks (39 language pairs, 111 MT systems), ReMedy achieves state-of-the-art performance at both segment- and system-level evaluation. Specifically, ReMedy-9B surpasses larger WMT winners and massive closed LLMs such as MetricX-13B, XCOMET-Ensemble, GEMBA-GPT-4, PaLM-540B, and finetuned PaLM2. Further analyses demonstrate that ReMedy delivers superior capability in detecting translation errors and evaluating low-quality translations.
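
The idea of learning relative quality from pairwise preferences, rather than regressing on absolute ratings, can be sketched with the Bradley-Terry style loss that is widely used in reward modeling. The abstract does not state the exact objective ReMedy uses, so the loss below is an assumption for illustration, and the name `pairwise_preference_loss` is hypothetical. The inputs stand for scalar scores a model would assign to the preferred and the rejected translation of the same source.

```python
import math


def pairwise_preference_loss(reward_preferred, reward_rejected):
    """Bradley-Terry style loss: -log(sigmoid(r_preferred - r_rejected)).

    The loss is small when the preferred translation is scored well above
    the rejected one, and grows roughly linearly when the order is wrong.
    """
    margin = reward_preferred - reward_rejected
    # -log(sigmoid(m)) == log(1 + exp(-m)); branch to avoid overflow.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Correctly ordered pair gives a lower loss than a wrongly ordered pair.
print(pairwise_preference_loss(2.0, 0.0))
print(pairwise_preference_loss(0.0, 2.0))
```

Only the difference between the two scores enters the loss, so a consistent offset in how raters use the rating scale has no effect. This is one plausible reason a preference-based formulation could be more tolerant of noisy absolute ratings.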

large language model, machine learning, translation, (16 more...)

arXiv.org Artificial Intelligence

2504.1363

Country:

North America > United States (0.46)
Europe (0.46)
Asia (0.28)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.90)
