AITopics

2412.15471

Country:

North America > United States (1.00)
Europe (1.00)
Asia > Middle East (0.68)

Genre:

Overview (0.86)
Research Report (0.64)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Furqan, Mohammed, Khaja, Raahid Bin, Habeeb, Rayyan

ERUPD -- English to Roman Urdu Parallel Dataset

Bridging linguistic gaps fosters global growth and cultural exchange. This study addresses the challenges of Roman Urdu -- a Latin-script adaptation of Urdu widely used in digital communication -- by creating a novel parallel dataset comprising 75,146 sentence pairs. Roman Urdu's lack of standardization, phonetic variability, and code-switching with English complicates language processing. We tackled this by employing a hybrid approach that combines synthetic data generated via advanced prompt engineering with real-world conversational data from personal messaging groups. We further refined the dataset through a human evaluation phase, addressing linguistic inconsistencies and ensuring accuracy in code-switching, phonetic representations, and synonym variability. The resulting dataset captures Roman Urdu's diverse linguistic features and serves as a critical resource for machine translation, sentiment analysis, and multilingual education.

large language model, machine learning, natural language, (16 more...)

2412.17562

Country:

Asia > India > Telangana > Hyderabad (0.05)
Europe > Czechia > Prague (0.04)
Asia > Middle East > Oman (0.04)
(4 more...)

Genre:

Research Report (0.65)
Overview (0.46)

Industry: Education (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Saunders, Danielle, DeNeefe, Steve

Domain adapted machine translation: What does catastrophic forgetting forget and why?

Neural Machine Translation (NMT) models can be specialized by domain adaptation, often involving fine-tuning on a dataset of interest. This process risks catastrophic forgetting: rapid loss of generic translation quality. Forgetting has been widely observed, with many mitigation methods proposed. However, the causes of forgetting and the relationship between forgetting and adaptation data are under-explored. This paper takes a novel approach to understanding catastrophic forgetting during NMT adaptation by investigating the impact of the data. We provide a first investigation of what is forgotten, and why. We examine the relationship between forgetting and the in-domain data, and show that the amount and type of forgetting is linked to that data's target vocabulary coverage. Our findings pave the way toward better informed NMT domain adaptation.

adaptation, computational linguistic, translation, (14 more...)

2412.17537

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > United Kingdom > England (0.04)
Europe > France > Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.04)
(17 more...)

Genre:

Research Report > New Finding (0.48)
Research Report > Promising Solution (0.34)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

2M-BELEBELE: Highly Multilingual Speech and American Sign Language Comprehension Dataset

Costa-jussà, Marta R., Yu, Bokai, Andrews, Pierre, Alastruey, Belen, Camgoz, Necati Cihan, Chuang, Joe, Maillard, Jean, Ropers, Christophe, Turkantenko, Arina, Wood, Carleigh

We introduce the first highly multilingual speech and American Sign Language (ASL) comprehension dataset by extending BELEBELE. Our dataset covers 74 spoken languages at the intersection of BELEBELE and FLEURS, and one sign language (ASL). We evaluate 2M-BELEBELE dataset for both 5-shot and zero-shot settings and across languages, the speech comprehension accuracy is ~ 2-3% average lower compared to reading comprehension.

dataset, indo-european, latn latn indo-european, (10 more...)

2412.08274

Country:

Africa > Niger (0.04)
North America > United States > Washington > King County > Seattle (0.04)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
Asia > Thailand > Bangkok > Bangkok (0.04)

Genre: Research Report (0.64)

Industry: Education > Curriculum > Subject-Specific Education (0.83)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.69)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.67)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.46)

Ensuring Consistency for In-Image Translation

Fu, Chengpeng, Feng, Xiaocheng, Huang, Yichong, Huo, Wenshuai, Li, Baohang, Zhang, Zhirui, Lu, Yunfei, Tu, Dandan, Tang, Duyu, Wang, Hui, Qin, Bing, Liu, Ting

The in-image machine translation task involves translating text embedded within images, with the translated results presented in image format. While this task has numerous applications in various scenarios such as film poster translation and everyday scene image translation, existing methods frequently neglect the aspect of consistency throughout this process. We propose the need to uphold two types of consistency in this task: translation consistency and image generation consistency. The former entails incorporating image information during translation, while the latter involves maintaining consistency between the style of the text-image and the original image, ensuring background integrity. To address these consistency requirements, we introduce a novel two-stage framework named HCIIT (High-Consistency In-Image Translation) which involves text-image translation using a multimodal multilingual large language model in the first stage and image backfilling with a diffusion model in the second stage. Chain of thought learning is utilized in the first stage to enhance the model's ability to leverage image information during translation. Subsequently, a diffusion model trained for style-consistent text-image generation ensures uniformity in text style within images and preserves background details. A dataset comprising 400,000 style-consistent pseudo text-image pairs is curated for model training. Results obtained on both curated test sets and authentic image test sets validate the effectiveness of our framework in ensuring consistency and producing high-quality translated images.

machine learning, natural language, translation, (18 more...)

2412.18139

Country: Asia > China (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Peng, Ziqian, Bawden, Rachel, Yvon, François

Investigating Length Issues in Document-level Machine Translation

Transformer architectures are increasingly effective at processing and generating very long chunks of texts, opening new perspectives for document-level machine translation (MT). In this work, we challenge the ability of MT systems to handle texts comprising up to several thousands of tokens. We design and implement a new approach designed to precisely measure the effect of length increments on MT outputs. Our experiments with two representative architectures unambiguously show that (a)~translation performance decreases with the length of the input text; (b)~the position of sentences within the document matters and translation quality is higher for sentences occurring earlier in a document. We further show that manipulating the distribution of document lengths and of positional embeddings only marginally mitigates such problems. Our results suggest that even though document-level MT is computationally feasible, it does not yet match the performance of sentence-based MT.

computational linguistic, machine learning, natural language, (17 more...)

2412.17592

Country:

Europe (1.00)
Asia (1.00)
North America > United States > Minnesota (0.28)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

arXiv.org Artificial IntelligenceDec-22-2024

Learning to Adapt to Low-Resource Paraphrase Generation

Li, Zhigen, Wang, Yanmeng, Fan, Rizhao, Wang, Ye, Li, Jianfeng, Wang, Shaojun

Paraphrase generation is a longstanding NLP task and achieves great success with the aid of large corpora. However, transferring a paraphrasing model to another domain encounters the problem of domain shifting especially when the data is sparse. At the same time, widely using large pre-trained language models (PLMs) faces the overfitting problem when training on scarce labeled data. To mitigate these two issues, we propose, LAPA, an effective adapter for PLMs optimized by meta-learning. LAPA has three-stage training on three types of related resources to solve this problem: 1. pre-training PLMs on unsupervised corpora, 2. inserting an adapter layer and meta-training on source domain labeled data, and 3. fine-tuning adapters on a small amount of target domain labeled data. This method enables paraphrase generation models to learn basic language knowledge first, then learn the paraphrasing task itself later, and finally adapt to the target task. Our experimental results demonstrate that LAPA achieves state-of-the-art in supervised, unsupervised, and low-resource settings on three benchmark datasets. With only 2\% of trainable parameters and 1\% labeled data of the target task, our approach can achieve a competitive performance with previous work.

computational linguistic, machine learning, natural language, (14 more...)

2412.17111

Country:

North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Europe > Denmark > Capital Region > Copenhagen (0.04)
Asia > China > Hong Kong (0.04)
(15 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.68)

Rathnayake, Charitha, Thilakarathna, P. R. S., Nethmini, Uthpala, Kaur, Rishemjith, Ranathunga, Surangika

Unsupervised Bilingual Lexicon Induction for Low Resource Languages

arXiv.org Artificial IntelligenceDec-22-2024

Bilingual lexicons play a crucial role in various Natural Language Processing tasks. However, many low-resource languages (LRLs) do not have such lexicons, and due to the same reason, cannot benefit from the supervised Bilingual Lexicon Induction (BLI) techniques. To address this, unsupervised BLI (UBLI) techniques were introduced. A prominent technique in this line is structure-based UBLI. It is an iterative method, where a seed lexicon, which is initially learned from monolingual embeddings is iteratively improved. There have been numerous improvements to this core idea, however they have been experimented with independently of each other. In this paper, we investigate whether using these techniques simultaneously would lead to equal gains. We use the unsupervised version of VecMap, a commonly used structure-based UBLI framework, and carry out a comprehensive set of experiments using the LRL pairs, English-Sinhala, English-Tamil, and English-Punjabi. These experiments helped us to identify the best combination of the extensions. We also release bilingual dictionaries for English-Sinhala and English-Punjabi.

bilingual lexicon induction, machine learning, natural language, (14 more...)

2412.16894

Country:

Asia (0.46)
Oceania (0.28)
Europe (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Yousofi, Waisullah, Bhattacharyya, Pushpak

Reconsidering SMT Over NMT for Closely Related Languages: A Case Study of Persian-Hindi Pair

arXiv.org Artificial IntelligenceDec-22-2024

This paper demonstrates that Phrase-Based Statistical Machine Translation (PBSMT) can outperform Transformer-based Neural Machine Translation (NMT) in moderate-resource scenarios, specifically for structurally similar languages, like the Persian-Hindi pair. Despite the Transformer architecture's typical preference for large parallel corpora, our results show that PBSMT achieves a BLEU score of 66.32, significantly exceeding the Transformer-NMT score of 53.7 on the same dataset. Additionally, we explore variations of the SMT architecture, including training on Romanized text and modifying the word order of Persian sentences to match the left-to-right (LTR) structure of Hindi. Our findings highlight the importance of choosing the right architecture based on language pair characteristics and advocate for SMT as a high-performing alternative, even in contexts commonly dominated by NMT.

artificial intelligence, machine learning, natural language, (19 more...)

2412.16877

Country:

Europe (1.00)
North America > United States > Colorado (0.14)
Asia > Middle East > UAE (0.14)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.55)

Goncalves, Lucas, Mathur, Prashant, Niu, Xing, Houston, Brady, Lavania, Chandrashekhar, Vishnubhotla, Srikanth, Sun, Lijia, Ferritto, Anthony

Improving Lip-synchrony in Direct Audio-Visual Speech-to-Speech Translation

arXiv.org Artificial IntelligenceDec-21-2024

Audio-Visual Speech-to-Speech Translation typically prioritizes improving translation quality and naturalness. However, an equally critical aspect in audio-visual content is lip-synchrony-ensuring that the movements of the lips match the spoken content-essential for maintaining realism in dubbed videos. Despite its importance, the inclusion of lip-synchrony constraints in AVS2S models has been largely overlooked. This study addresses this gap by integrating a lip-synchrony loss into the training process of AVS2S models. Our proposed method significantly enhances lip-synchrony in direct audio-visual speech-to-speech translation, achieving an average LSE-D score of 10.67, representing a 9.2% reduction in LSE-D over a strong baseline across four language pairs. Additionally, it maintains the naturalness and high quality of the translated speech when overlaid onto the original video, without any degradation in translation quality.

artificial intelligence, natural language, translation, (16 more...)

2412.1653

Country: North America > United States (0.14)

Genre: Research Report (1.00)

Industry: Law (0.68)

Technology:

Information Technology > Artificial Intelligence > Speech (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)