AITopics

2505.13908

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.95)

arXiv.org Artificial IntelligenceMay-21-2025

Combining the Best of Both Worlds: A Method for Hybrid NMT and LLM Translation

Wu, Zhanglin, Wei, Daimeng, Chen, Xiaoyu, Shang, Hengchao, Guo, Jiaxin, Li, Zongyao, Luo, Yuanchang, Yang, Jinlong, Rao, Zhiqiang, Yang, Hao

Large language model (LLM) shows promising performances in a variety of downstream tasks, such as machine translation (MT). However, using LLMs for translation suffers from high computational costs and significant latency. Based on our evaluation, in most cases, translations using LLMs are comparable to that generated by neural machine translation (NMT) systems. Only in particular scenarios, LLM and NMT models show respective advantages. As a result, integrating NMT and LLM for translation and using LLM only when necessary seems to be a sound solution. A scheduling policy that optimizes translation result while ensuring fast speed and as little LLM usage as possible is thereby required. We compare several scheduling policies and propose a novel and straightforward decider that leverages source sentence features. We conduct extensive experiments on multilingual test sets and the result shows that we can achieve optimal translation performance with minimal LLM usage, demonstrating effectiveness of our decider.

large language model, natural language, translation, (18 more...)

2505.13554

Country: Asia (0.28)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Rahman, Md. Atiqur, Islam, Sabrina, Omi, Mushfiqul Haque

LLM-Based Evaluation of Low-Resource Machine Translation: A Reference-less Dialect Guided Approach with a Refined Sylheti-English Benchmark

Evaluating machine translation (MT) for low - resource languages poses a persistent challenge, primarily due to the limited availability of high - quality reference translations. This issue is further exacerbated in languages with multiple dialects, where linguistic diversity and data scarcity hinder robust evaluation. Large Language Models (LLMs) present a promising solution through reference - free evaluation techniques; however, their effectiveness diminishes in the absence of dialect - specific context and tailored guidance. In this work, we propose a comprehensive framework that enhances LLM - based MT evaluation using a dialect guided approach. We extend the ONUBAD dataset by incorporating Sylheti - English sentence pairs, corresponding machine - translations, and Direct Assessment (DA) scores annotated by native speakers. To address the vocabulary gap, we augment the tokenizer vocabulary with dialect - specific terms. We further introduce a regression head to enable scalar score prediction and design a dialect - guided (DG) prompting strategy. Our evaluation across multiple LLMs shows that the proposed pipeline consistently outperforms existing methods, achieving the highest gain of +0.1083 in Spear-man correlation, along with improvements across other evaluation settings. The dataset and the code are available at https://github.com/180041123 - Atiq/MTEonLowResourceLanguage .

artificial intelligence, large language model, natural language, (14 more...)

2505.12273

Country:

Asia > Bangladesh > Dhaka Division > Dhaka District > Dhaka (0.05)
Asia > Middle East > Israel (0.04)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

KIT's Offline Speech Translation and Instruction Following Submission for IWSLT 2025

Koneru, Sai, Züfle, Maike, Nguyen, Thai-Binh, Akti, Seymanur, Niehues, Jan, Waibel, Alexander

The scope of the International Workshop on Spoken Language Translation (IWSLT) has recently broadened beyond traditional Speech Translation (ST) to encompass a wider array of tasks, including Speech Question Answering and Summarization. This shift is partly driven by the growing capabilities of modern systems, particularly with the success of Large Language Models (LLMs). In this paper, we present the Karlsruhe Institute of Technology's submissions for the Offline ST and Instruction Following (IF) tracks, where we leverage LLMs to enhance performance across all tasks. For the Offline ST track, we propose a pipeline that employs multiple automatic speech recognition systems, whose outputs are fused using an LLM with document-level context. This is followed by a two-step translation process, incorporating additional refinement step to improve translation quality. For the IF track, we develop an end-to-end model that integrates a speech encoder with an LLM to perform a wide range of instruction-following tasks. We complement it with a final document-level refinement stage to further enhance output quality by using contextual information.

large language model, natural language, translation, (18 more...)

2505.13036

Country:

Asia > Thailand (0.28)
North America > Mexico (0.28)
Europe > Germany > Baden-Württemberg > Karlsruhe Region > Karlsruhe (0.24)

Genre: Research Report > New Finding (0.93)

Industry:

Government (0.68)
Leisure & Entertainment (0.47)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

ExTrans: Multilingual Deep Reasoning Translation via Exemplar-Enhanced Reinforcement Learning

Wang, Jiaan, Meng, Fandong, Zhou, Jie

In recent years, the emergence of large reasoning models (LRMs), such as OpenAI-o1 and DeepSeek-R1, has shown impressive capabilities in complex problems, e.g., mathematics and coding. Some pioneering studies attempt to bring the success of LRMs in neural machine translation (MT). They try to build LRMs with deep reasoning MT ability via reinforcement learning (RL). Despite some progress that has been made, these attempts generally focus on several high-resource languages, e.g., English and Chinese, leaving the performance on other languages unclear. Besides, the reward modeling methods in previous work do not fully unleash the potential of reinforcement learning in MT. In this work, we first design a new reward modeling method that compares the translation results of the policy MT model with a strong LRM (i.e., DeepSeek-R1-671B), and quantifies the comparisons to provide rewards. Experimental results demonstrate the superiority of the reward modeling method. Using Qwen2.5-7B-Instruct as the backbone, the trained model achieves the new state-of-the-art performance in literary translation, and outperforms strong LRMs including OpenAI-o1 and DeepSeeK-R1. Furthermore, we extend our method to the multilingual settings with 11 languages. With a carefully designed lightweight reward modeling in RL, we can simply transfer the strong MT ability from a single direction into multiple (i.e., 90) translation directions and achieve impressive multilingual MT performance.

large language model, machine learning, translation, (19 more...)

2505.12996

Country:

Asia (0.68)
North America > Mexico (0.28)
North America > United States (0.28)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.45)

Trans-Zero: Self-Play Incentivizes Large Language Models for Multilingual Translation Without Parallel Data

Zou, Wei, Yang, Sen, Bao, Yu, Huang, Shujian, Chen, Jiajun, Cheng, Shanbo

The rise of Large Language Models (LLMs) has reshaped machine translation (MT), but multilingual MT still relies heavily on parallel data for supervised fine-tuning (SFT), facing challenges like data scarcity for low-resource languages and catastrophic forgetting. To address these issues, we propose TRANS-ZERO, a self-play framework that leverages only monolingual data and the intrinsic multilingual knowledge of LLM. TRANS-ZERO combines Genetic Monte-Carlo Tree Search (G-MCTS) with preference optimization, achieving strong translation performance that rivals supervised methods. Experiments demonstrate that this approach not only matches the performance of models trained on large-scale parallel data but also excels in non-English translation directions. Further analysis reveals that G-MCTS itself significantly enhances translation quality by exploring semantically consistent candidates through iterative translations, providing a robust foundation for the framework's succuss.

large language model, machine learning, translation, (18 more...)

2504.14669

Country:

Asia (1.00)
North America > Mexico > Mexico City (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Evaluating Menu OCR and Translation: A Benchmark for Aligning Human and Automated Evaluations in Large Vision-Language Models

Wu, Zhanglin, Song, Tengfei, Xie, Ning, Zhu, Mengli, Zhang, Weidong, Wu, Shuang, Li, Pengfei, Li, Chong, Zhu, Junhao, Yang, Hao, Sun, Shiliang

The rapid advancement of large vision-language models (LVLMs) has significantly propelled applications in document understanding, particularly in optical character recognition (OCR) and multilingual translation. However, current evaluations of LVLMs, like the widely used OCRBench, mainly focus on verifying the correctness of their short-text responses and long-text responses with simple layout, while the evaluation of their ability to understand long texts with complex layout design is highly significant but largely overlooked. In this paper, we propose Menu OCR and Translation Benchmark (MOTBench), a specialized evaluation framework emphasizing the pivotal role of menu translation in cross-cultural communication. MOTBench requires LVLMs to accurately recognize and translate each dish, along with its price and unit items on a menu, providing a comprehensive assessment of their visual understanding and language processing capabilities. Our benchmark is comprised of a collection of Chinese and English menus, characterized by intricate layouts, a variety of fonts, and culturally specific elements across different languages, along with precise human annotations. Experiments show that our automatic evaluation results are highly consistent with professional human evaluation. W e evaluate a range of publicly available state-of-the-art LVLMs, and through analyzing their output to identify the strengths and weaknesses in their performance, offering valuable insights to guide future advancements in LVLM development. MOTBench is available at https://github.com/gitwzl/MOTBench .

large language model, machine learning, translation, (22 more...)

2504.13945

Country: Asia > China (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

Dat, Phan Tran Minh, Khang, Vo Hoang Nhat, Tho, Quan Thanh

Towards Cultural Bridge by Bahnaric-Vietnamese Translation Using Transfer Learning of Sequence-To-Sequence Pre-training Language Model

arXiv.org Artificial IntelligenceMay-19-2025

This work explores the journey towards achieving Bahnaric-Vietnamese translation for the sake of culturally bridging the two ethnic groups in Vietnam. However, translating from Bahnaric to Vietnamese also encounters some difficulties. The most prominent challenge is the lack of available original Bahnaric resources source language, including vocabulary, grammar, dialogue patterns and bilingual corpus, which hinders the data collection process for training. To address this, we leverage a transfer learning approach using sequence-to-sequence pre-training language model. First of all, we leverage a pre-trained Vietnamese language model to capture the characteristics of this language. Especially, to further serve the purpose of machine translation, we aim for a sequence-to-sequence model, not encoder-only like BERT or decoder-only like GPT. Taking advantage of significant similarity between the two languages, we continue training the model with the currently limited bilingual resources of Vietnamese-Bahnaric text to perform the transfer learning from language model to machine translation. Thus, this approach can help to handle the problem of imbalanced resources between two languages, while also optimizing the training and computational processes. Additionally, we also enhanced the datasets using data augmentation to generate additional resources and defined some heuristic methods to help the translation more precise. Our approach has been validated to be highly effective for the Bahnaric-Vietnamese translation model, contributing to the expansion and preservation of languages, and facilitating better mutual understanding between the two ethnic people.

machine learning, natural language, translation, (18 more...)

2505.11421

Country:

Asia > Vietnam > Hồ Chí Minh City > Hồ Chí Minh City (0.06)
Asia > Vietnam > Bình Định Province (0.05)
Asia > Vietnam > Kon Tum Province > Kon Tum (0.05)
Asia > Vietnam > Gia Lai Province (0.05)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Madhavi, Hrishit, Cherian, Jacob, Khamkar, Yuvraj, Bhagat, Dhananjay

Low-Resource Language Processing: An OCR-Driven Summarization and Translation Pipeline

arXiv.org Artificial IntelligenceMay-19-2025

With the abundance of information in today's digital world, it is a major challenge to process voluminous text from news articles, reports, and web pages in an efficient manner. Text summarization solves this problem by providing brief, informative summaries of lengthy documents, both saving end-users time and mental effort [1]. Whereas traditional summarization methods involve only extractive approaches (identifying major sentences out of the source text) and abstractive approaches (producing new sentences capturing the core meaning), the current project outlines a holistic, multi-step NLP pipeline extending beyond mere summarization efforts [1]. The pipeline starts with Optical Character Recognition (OCR), which is achieved with Tesseract (Pytesseract). This module yields machine-readable text from images and handles various languages such as English, Hindi, Tamil, Urdu, Bengali, and Telugu [1]. The extracted information then passes through a chain of Natural Language Processing (NLP) and Machine Learning (ML) modules for more in-depth text analysis. The main elements of this pipeline are: The system combines state-of-the-art NLP features to boost text comprehension and processing.

large language model, machine learning, natural language, (14 more...)

2505.11177

Country: Asia > India > Maharashtra > Pune (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.73)
(2 more...)

Thu, Ye Kyaw, Oo, Thazin Myint

Reconstructing Syllable Sequences in Abugida Scripts with Incomplete Inputs

arXiv.org Artificial IntelligenceMay-19-2025

This paper explores syllable sequence prediction in Abugida languages using Transformer-based models, focusing on six languages: Bengali, Hindi, Khmer, Lao, Myanmar, and Thai, from the Asian Language Treebank (ALT) dataset. We investigate the reconstruction of complete syllable sequences from various incomplete input types, including consonant sequences, vowel sequences, partial syllables (with random character deletions), and masked syllables (with fixed syllable deletions). Our experiments reveal that consonant sequences play a critical role in accurate syllable prediction, achieving high BLEU scores, while vowel sequences present a significantly greater challenge. The model demonstrates robust performance across tasks, particularly in handling partial and masked syllable reconstruction, with strong results for tasks involving consonant information and syllable masking. This study advances the understanding of sequence prediction for Abugida languages and provides practical insights for applications such as text prediction, spelling correction, and data augmentation in these scripts.

large language model, machine learning, natural language, (20 more...)

2505.11008

Country:

Asia > Myanmar (0.29)
Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States (0.04)
(3 more...)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.96)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.36)