AITopics | Machine Translation

Collaborating Authors

Machine Translation

"Machine translation (MT) is the application of computers to the task of translating texts from one natural language to another. One of the very earliest pursuits in computer science, MT has proved to be an elusive goal, but today a number of systems are available which produce output which, if not perfect, is of sufficient quality to be useful in a number of specific domains."
– Definition from the European Association for Machine Translation (EAMT).

You can translate text of your choice by using free translators such as: CAPITA, Google Translate, SDL International, SYSTRAN.

News Overviews Instructional Materials AI-Alerts Classics

Layer-wise Regularized Dropout for Neural Language Models

Ni, Shiwen, Yang, Min, Xu, Ruifeng, Li, Chengming, Hu, Xiping

arXiv.org Artificial IntelligenceFeb-26-2024

Among the various pre-trained neural language models that are popular today, dropout is already an indispensable regularization technique. To solve the inconsistency between training and inference caused by the randomness of dropout, some studies use consistency training to regularize dropout at the output layer. In this paper, we propose a novel Layer-wise Regularized Dropout (LR-Drop), which is specially designed for Transformer-based Language models. Specifically, LR-Drop layer-wise regularizes each Transformer layer using the consistency training strategy. Each training sample passes through the two siamese sub-models sampled by dropout, and then LR-Drop forces the hidden states, multi-head attention matrices, and output distribution of the two siamese sub-models to be consistent. The proposed LR-Drop can be regarded as a "self-distillation" framework, in which each sub-model generated by dropout is the other's "teacher" model and "student" model. Through extensive experiments on 8 natural language understanding datasets, 6 neural machine translation datasets, and 1 abstractive summarization dataset (a total of 15 datasets), we show that LR-Drop achieves superior performances, including state-of-the-art results.

dataset, lr-drop, regularization, (16 more...)

arXiv.org Artificial Intelligence

2402.16361

Country:

Asia > China > Guangdong Province > Shenzhen (0.04)
Asia > China > Heilongjiang Province > Harbin (0.04)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

Add feedback

Adaptation of Biomedical and Clinical Pretrained Models to French Long Documents: A Comparative Study

Bazoge, Adrien, Morin, Emmanuel, Daille, Beatrice, Gourraud, Pierre-Antoine

arXiv.org Artificial IntelligenceFeb-26-2024

Recently, pretrained language models based on BERT have been introduced for the French biomedical domain. Although these models have achieved state-of-the-art results on biomedical and clinical NLP tasks, they are constrained by a limited input sequence length of 512 tokens, which poses challenges when applied to clinical notes. In this paper, we present a comparative study of three adaptation strategies for long-sequence models, leveraging the Longformer architecture. We conducted evaluations of these models on 16 downstream tasks spanning both biomedical and clinical domains. Our findings reveal that further pre-training an English clinical model with French biomedical texts can outperform both converting a French biomedical BERT to the Longformer architecture and pre-training a French biomedical Longformer from scratch. The results underscore that long-sequence French biomedical models improve performance across most downstream tasks regardless of sequence length, but BERT based models remain the most efficient for named entity recognition tasks.

architecture, classification, dataset, (15 more...)

arXiv.org Artificial Intelligence

2402.16689

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > France > Pays de la Loire > Loire-Atlantique > Nantes (0.05)
North America > Canada > Ontario > Toronto (0.05)
(8 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.88)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Training a Bilingual Language Model by Mapping Tokens onto a Shared Character Space

Rom, Aviad, Bar, Kfir

arXiv.org Artificial IntelligenceFeb-25-2024

We train a bilingual Arabic-Hebrew language model using a transliterated version of Arabic texts in Hebrew, to ensure both languages are represented in the same script. Given the morphological, structural similarities, and the extensive number of cognates shared among Arabic and Hebrew, we assess the performance of a language model that employs a unified script for both languages, on machine translation which requires cross-lingual knowledge. The results are promising: our model outperforms a contrasting model which keeps the Arabic texts in the Arabic script, demonstrating the efficacy of the transliteration step. Despite being trained on a dataset approximately 60% smaller than that of other existing language models, our model appears to deliver comparable performance in machine translation across both translation directions.

computational linguistic, hebrew, language model, (13 more...)

arXiv.org Artificial Intelligence

2402.16065

Country:

North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.04)
North America > Dominican Republic (0.04)
(8 more...)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.72)

Add feedback

Interpreting Predictive Probabilities: Model Confidence or Human Label Variation?

Baan, Joris, Fernández, Raquel, Plank, Barbara, Aziz, Wilker

arXiv.org Artificial IntelligenceFeb-25-2024

With the rise of increasingly powerful and user-facing NLP systems, there is growing interest in assessing whether they have a good representation of uncertainty by evaluating the quality of their predictive distribution over outcomes. We identify two main perspectives that drive starkly different evaluation protocols. The first treats predictive probability as an indication of model confidence; the second as an indication of human label variation. We discuss their merits and limitations, and take the position that both are crucial for trustworthy and fair NLP systems, but that exploiting a single predictive distribution is limiting. We recommend tools and highlight exciting directions towards models with disentangled representations of uncertainty about predictions and uncertainty about human labels.

computational linguistic, linguistic, proceedings, (14 more...)

arXiv.org Artificial Intelligence

2402.16102

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
North America > United States > New York > New York County > New York City (0.05)
(19 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
(2 more...)

Add feedback

TMT: Tri-Modal Translation between Speech, Image, and Text by Processing Different Modalities as Different Languages

Kim, Minsu, Jung, Jee-weon, Rha, Hyeongseop, Maiti, Soumi, Arora, Siddhant, Chang, Xuankai, Watanabe, Shinji, Ro, Yong Man

arXiv.org Artificial IntelligenceFeb-25-2024

The capability to jointly process multi-modal information is becoming an essential task. However, the limited number of paired multi-modal data and the large computational requirements in multi-modal learning hinder the development. We propose a novel Tri-Modal Translation (TMT) model that translates between arbitrary modalities spanning speech, image, and text. We introduce a novel viewpoint, where we interpret different modalities as different languages, and treat multi-modal translation as a well-established machine translation problem. To this end, we tokenize speech and image data into discrete tokens, which provide a unified interface across modalities and significantly decrease the computational cost. In the proposed TMT, a multi-modal encoder-decoder conducts the core translation, whereas modality-specific processing is conducted only within the tokenization and detokenization stages. We evaluate the proposed TMT on all six modality translation tasks. TMT outperforms single model counterparts consistently, demonstrating that unifying tasks is beneficial not only for practicality but also for performance.

modality, speech, translation, (14 more...)

arXiv.org Artificial Intelligence

2402.16021

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
North America > United States > California > Los Angeles County > Los Angeles (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)
(3 more...)

Genre: Research Report (0.82)

Industry: Leisure & Entertainment > Sports (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
(2 more...)

Add feedback

Direct Punjabi to English speech translation using discrete units

Kaur, Prabhjot, Bush, L. Andrew M., Shi, Weisong

arXiv.org Artificial IntelligenceFeb-24-2024

Speech-to-speech translation is yet to reach the same level of coverage as text-to-text translation systems. The current speech technology is highly limited in its coverage of over 7000 languages spoken worldwide, leaving more than half of the population deprived of such technology and shared experiences. With voice-assisted technology (such as social robots and speech-to-text apps) and auditory content (such as podcasts and lectures) on the rise, ensuring that the technology is available for all is more important than ever. Speech translation can play a vital role in mitigating technological disparity and creating a more inclusive society. With a motive to contribute towards speech translation research for low-resource languages, our work presents a direct speech-to-speech translation model for one of the Indic languages called Punjabi to English. Additionally, we explore the performance of using a discrete representation of speech called discrete acoustic units as input to the Transformer-based translation model. The model, abbreviated as Unit-to-Unit Translation (U2UT), takes a sequence of discrete units of the source language (the language being translated from) and outputs a sequence of discrete units of the target language (the language being translated to). Our results show that the U2UT model performs better than the Speech-to-Unit Translation (S2UT) model by a 3.69 BLEU score.

discrete unit, speech, translation, (13 more...)

arXiv.org Artificial Intelligence

2402.15967

Country:

North America > United States > Utah (0.04)
North America > United States > Michigan > Wayne County > Dearborn (0.04)
North America > United States > Massachusetts (0.04)
(2 more...)

Genre: Research Report > New Finding (1.00)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Does the Generator Mind its Contexts? An Analysis of Generative Model Faithfulness under Context Transfer

Hu, Xinshuo, Hu, Baotian, Li, Dongfang, Li, Xiaoguang, Shang, Lifeng

arXiv.org Artificial IntelligenceFeb-22-2024

The present study introduces the knowledge-augmented generator, which is specifically designed to produce information that remains grounded in contextual knowledge, regardless of alterations in the context. Previous research has predominantly focused on examining hallucinations stemming from static input, such as in the domains of summarization or machine translation. However, our investigation delves into the faithfulness of generative question answering in the presence of dynamic knowledge. Our objective is to explore the existence of hallucinations arising from parametric memory when contextual knowledge undergoes changes, while also analyzing the underlying causes for their occurrence. In order to efficiently address this issue, we propose a straightforward yet effective measure for detecting such hallucinations. Intriguingly, our investigation uncovers that all models exhibit a tendency to generate previous answers as hallucinations. To gain deeper insights into the underlying causes of this phenomenon, we conduct a series of experiments that verify the critical role played by context in hallucination, both during training and testing, from various perspectives.

computational linguistic, hallucination, knowledge, (16 more...)

arXiv.org Artificial Intelligence

2402.14488

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.14)
North America > Dominican Republic (0.04)
(11 more...)

Genre: Research Report > New Finding (0.47)

Industry: Government (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Generation (0.66)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Ar-Spider: Text-to-SQL in Arabic

Almohaimeed, Saleh, Almohaimeed, Saad, Ghanim, Mansour Al, Wang, Liqiang

arXiv.org Artificial IntelligenceFeb-22-2024

In Natural Language Processing (NLP), one of the most important tasks is text-to-SQL semantic parsing, which focuses on enabling users to interact with the database in a more natural manner. In recent years, text-to-SQL has made significant progress, but most were English-centric. In this paper, we introduce Ar-Spider 1, the first Arabic cross-domain text-to-SQL dataset. Due to the unique nature of the language, two major challenges have been encountered, namely schema linguistic and SQL structural challenges. In order to handle these issues and conduct the experiments, we adopt two baseline models LGESQL [4] and S2SQL [12], both of which are tested with two cross-lingual models to alleviate the effects of schema linguistic and SQL structure linking challenges. The baselines demonstrate decent single-language performance on our Arabic text-to-SQL dataset, Ar-Spider, achieving 62.48% for S2SQL and 65.57% for LGESQL, only 8.79% below the highest results achieved by the baselines when trained in English dataset. To achieve better performance on Arabic text-to-SQL, we propose the context similarity relationship (CSR) approach, which results in a significant increase in the overall performance of about 1.52% for S2SQL and 1.06% for LGESQL and closes the gap between Arabic and English languages to 7.73%.

ar-spider, arxiv preprint arxiv, dataset, (14 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3605098.3636065.

2402.15012

Country:

North America > United States > Florida > Orange County > Orlando (0.14)
Europe > Spain > Castile and León > Ávila Province > Ávila (0.05)
North America > United States > Florida > Hillsborough County > University (0.04)
(2 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.71)

Add feedback

A Classification-Guided Approach for Adversarial Attacks against Neural Machine Translation

Sadrizadeh, Sahar, Dolamic, Ljiljana, Frossard, Pascal

arXiv.org Artificial IntelligenceFeb-22-2024

Neural Machine Translation (NMT) models have been shown to be vulnerable to adversarial attacks, wherein carefully crafted perturbations of the input can mislead the target model. In this paper, we introduce ACT, a novel adversarial attack framework against NMT systems guided by a classifier. In our attack, the adversary aims to craft meaning-preserving adversarial examples whose translations in the target language by the NMT model belong to a different class than the original translations. Unlike previous attacks, our new approach has a more substantial effect on the translation by altering the overall meaning, which then leads to a different class determined by an oracle classifier. To evaluate the robustness of NMT models to our attack, we propose enhancements to existing black-box word-replacement-based attacks by incorporating output translations of the target NMT model and the output logits of a classifier within the attack process. Extensive experiments, including a comparison with existing untargeted attacks, show that our attack is considerably more successful in altering the class of the output translation and has more effect on the translation. This new paradigm can reveal the vulnerabilities of NMT systems by focusing on the class of translation rather than the mere translation quality as studied traditionally.

classifier, nmt model, translation, (15 more...)

arXiv.org Artificial Intelligence

2308.15246

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
Europe > Switzerland (0.04)
Europe > Germany (0.04)
(2 more...)

Genre: Research Report (0.82)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Bangla AI: A Framework for Machine Translation Utilizing Large Language Models for Ethnic Media

Goni, MD Ashraful, Mostafa, Fahad, Kee, Kerk F.

arXiv.org Artificial IntelligenceFeb-21-2024

Ethnic media, which caters to diaspora communities in host nations, serves as a vital platform for these communities to both produce content and access information. Rather than utilizing the language of the host nation, ethnic media delivers news in the language of the immigrant community. For instance, in the USA, Bangla ethnic media presents news in Bangla rather than English. This research delves into the prospective integration of large language models (LLM) and multi-lingual machine translations (MMT) within the ethnic media industry. It centers on the transformative potential of using LLM in MMT in various facets of news translation, searching, and categorization. The paper outlines a theoretical framework elucidating the integration of LLM and MMT into the news searching and translation processes for ethnic media. Additionally, it briefly addresses the potential ethical challenges associated with the incorporation of LLM and MMT in news translation procedures.

ethnic media, information, llm and mmt, (10 more...)

arXiv.org Artificial Intelligence

2402.14179

Country:

North America > United States > New York (0.07)
North America > United States > Texas (0.05)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Asia > Bangladesh (0.04)

Genre: Research Report (1.00)

Industry:

Media > News (1.00)
Government > Regional Government (0.95)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback