AITopics | Machine Translation

Machine Translation

"Machine translation (MT) is the application of computers to the task of translating texts from one natural language to another. One of the very earliest pursuits in computer science, MT has proved to be an elusive goal, but today a number of systems are available which produce output which, if not perfect, is of sufficient quality to be useful in a number of specific domains."
– Definition from the European Association for Machine Translation (EAMT).

You can translate text of your choice with free translators such as CAPITA, Google Translate, SDL International, and SYSTRAN.

How Effective is Byte Pair Encoding for Out-Of-Vocabulary Words in Neural Machine Translation?

Araabi, Ali, Monz, Christof, Niculae, Vlad

arXiv.org Artificial Intelligence, Aug-17-2022

Neural Machine Translation (NMT) is an open vocabulary problem. As a result, dealing with words that do not occur during training (a.k.a. out-of-vocabulary (OOV) words) has long been a fundamental challenge for NMT systems. The predominant method to tackle this problem is Byte Pair Encoding (BPE), which splits words, including OOV words, into sub-word segments. BPE has achieved impressive results for a wide range of translation tasks in terms of automatic evaluation metrics. While it is often assumed that by using BPE, NMT systems are capable of handling OOV words, the effectiveness of BPE in translating OOV words has not been explicitly measured. In this paper, we study to what extent BPE is successful in translating OOV words at the word level. We analyze the translation quality of OOV words based on word type, number of segments, cross-attention weights, and the frequency of segment n-grams in the training data. Our experiments show that while careful BPE settings seem to be fairly useful in translating OOV words across datasets, a considerable percentage of OOV words are translated incorrectly. Furthermore, we highlight the slightly higher effectiveness of BPE in translating OOV words for special cases, such as named entities and when the languages involved are linguistically close to each other.
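To make the abstract's description of BPE concrete, the following is a minimal sketch of how merge operations can be learned from word counts and then applied to a word that never appeared in training. It is not the authors' code or any particular toolkit's API: the function names, the `</w>` end-of-word marker, and the toy word counts are all assumptions chosen for illustration.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe(word_freqs, num_merges):
    """Learn up to `num_merges` merge operations from a {word: count} mapping."""
    # Each word starts as a sequence of characters plus an end-of-word marker
    # (the "</w>" marker is an illustrative convention, not taken from the paper).
    vocab = {tuple(word) + ("</w>",): freq for word, freq in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        pair_counts = Counter()
        for symbols, freq in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pair_counts[pair] += freq
        if not pair_counts:
            break  # every word is already a single symbol
        best = max(pair_counts, key=pair_counts.get)
        merges.append(best)
        vocab = {merge_pair(symbols, best): freq for symbols, freq in vocab.items()}
    return merges


def segment(word, merges):
    """Segment any word, including an unseen one, by replaying the learned merges in order."""
    symbols = tuple(word) + ("</w>",)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


if __name__ == "__main__":
    # Toy training counts (hypothetical data).
    merges = learn_bpe({"low": 5, "lower": 2, "newest": 6, "widest": 3}, num_merges=10)
    # "lowest" is out-of-vocabulary here, yet it is still split into known sub-word segments.
    print(segment("lowest", merges))
```

The point of the sketch is that an OOV word never maps to an unknown token: in the worst case it falls back to single characters. Whether those segments are then translated correctly is the question the paper measures.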

oov word, translation, translation quality, (12 more...)

arXiv.org Artificial Intelligence

2208.05225

Country:

Europe > Italy > Tuscany > Florence (0.04)
Europe > Germany > Berlin (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)
(21 more...)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

A Bidirectional Tree Tagging Scheme for Joint Medical Relation Extraction

Luo, Xukun, Liu, Weijie, Ma, Meng, Wang, Ping

arXiv.org Artificial Intelligence, Aug-17-2022

Joint medical relation extraction refers to extracting triples, composed of entities and relations, from medical text with a single model. One solution is to convert this task into a sequential tagging task. However, in existing work, methods that represent and tag the triples in a linear way fail to handle overlapping triples, and methods that organize the triples as a graph face the challenge of large computational effort. In this paper, inspired by the tree-like relation structures in medical text, we propose a novel scheme called Bidirectional Tree Tagging (BiTT) to form the medical relation triples into two binary trees and convert the trees into a word-level tag sequence. Based on the BiTT scheme, we develop a joint relation extraction model to predict the BiTT tags and further extract medical triples efficiently. Our model outperforms the best baselines by 2.0% and 2.5% in F1 score on two medical datasets. Moreover, the models with our BiTT scheme also obtain promising results on three public datasets from other domains.

dataset, extraction, proceedings, (17 more...)

arXiv.org Artificial Intelligence

2008.13339

Country: Asia > China > Beijing > Beijing (0.05)

Genre: Research Report (0.82)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.68)

Reproduction and Replication of an Adversarial Stylometry Experiment

Wang, Haining, Juola, Patrick, Riddell, Allen

arXiv.org Artificial Intelligence, Aug-15-2022

Maintaining anonymity while communicating using natural language remains a challenge. Standard authorship attribution techniques that analyze candidate authors' writing styles achieve uncomfortably high accuracy even when the number of candidate authors is high. Adversarial stylometry defends against authorship attribution with the goal of preventing unwanted deanonymization. This paper reproduces and replicates experiments in a seminal study of defenses against authorship attribution (Brennan et al., 2012). We are able to successfully reproduce and replicate the original results, although we conclude that the effectiveness of the defenses studied is overstated due to a lack of a control group in the original study. In our replication, we find new evidence suggesting that an entirely automatic method, round-trip translation, merits re-examination as it appears to reduce the effectiveness of established authorship attribution methods.

authorship attribution, participant, translation, (14 more...)

arXiv.org Artificial Intelligence

2208.07395

Country:

North America > United States > Indiana > Monroe County > Bloomington (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Maryland > Baltimore (0.04)
North America > United States > Colorado > Denver County > Denver (0.04)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Industry: Information Technology > Security & Privacy (0.68)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.47)

Fast Vocabulary Projection Method via Clustering for Multilingual Machine Translation on GPU

Amer, Hossam, Kim, Young Jin, Afify, Mohamed, Matsushita, Hitokazu, Awadallah, Hany Hassan

arXiv.org Artificial Intelligence, Aug-14-2022

Multilingual Neural Machine Translation has been showing great success using transformer models. Deploying these models is challenging because they usually require large vocabulary (vocab) sizes for various languages. This limits the speed of predicting the output tokens in the last vocab projection layer. To alleviate these challenges, this paper proposes a fast vocabulary projection method via clustering which can be used for multilingual transformers on GPUs. First, we offline split the vocab search space into disjoint clusters given the hidden context vector of the decoder output, which results in much smaller vocab columns for vocab projection. Second, at inference time, the proposed method predicts the clusters and candidate active tokens for hidden context vectors at the vocab projection. This paper also includes analysis of different ways of building these clusters in multilingual settings. Our results show end-to-end speed gains in float16 GPU inference up to 25% while maintaining the BLEU score and slightly increasing memory cost. The proposed method speeds up the vocab projection step itself by up to 2.6x. We also conduct an extensive human evaluation to verify the proposed method preserves the quality of the translations from the original model.
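The two-step idea in this abstract (cluster decoder hidden states offline, then project only onto each cluster's candidate tokens at inference time) can be sketched roughly as follows. This is an illustration under stated assumptions, not the paper's implementation: the plain k-means routine, the rule that a cluster's candidates are the top-1 tokens of its member vectors, the fallback to the full projection, and all function names are hypothetical choices made here.

```python
import numpy as np


def nearest(h, centroids):
    """Index of the closest centroid (squared Euclidean distance) for each row of h."""
    return ((h[:, None, :] - centroids[None, :, :]) ** 2).sum(-1).argmin(1)


def kmeans(x, k, iters=10, seed=0):
    """Plain k-means; an assumed stand-in for whatever clustering the paper uses."""
    rng = np.random.default_rng(seed)
    centroids = x[rng.choice(len(x), size=k, replace=False)].copy()
    for _ in range(iters):
        assign = nearest(x, centroids)
        for c in range(k):
            members = x[assign == c]
            if len(members):
                centroids[c] = members.mean(0)
    return centroids


def build_clusters(hidden, W, k):
    """Offline step: cluster hidden vectors and record each cluster's candidate tokens."""
    centroids = kmeans(hidden, k)
    assign = nearest(hidden, centroids)
    top1 = (hidden @ W).argmax(1)  # full projection, computed once offline
    candidates = [np.unique(top1[assign == c]) for c in range(k)]
    return centroids, candidates


def fast_argmax(h, W, centroids, candidates):
    """Inference step: pick the cluster, then project only onto its candidate columns."""
    c = int(nearest(h[None, :], centroids)[0])
    cand = candidates[c]
    if cand.size == 0:  # empty cluster: fall back to the full projection
        return int((h @ W).argmax())
    return int(cand[(h @ W[:, cand]).argmax()])


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    hidden = rng.normal(size=(200, 8))  # toy decoder states (hypothetical data)
    W = rng.normal(size=(8, 50))        # toy projection matrix, vocabulary size 50
    centroids, candidates = build_clusters(hidden, W, k=4)
    print([len(c) for c in candidates])  # candidate-set sizes, each at most 50
    print(fast_argmax(hidden[0], W, centroids, candidates))
```

The saving comes from `W[:, cand]` having far fewer columns than `W`. The price is that a token outside the predicted cluster's candidate set can never be chosen, which is why quality has to be checked separately, as the abstract reports doing with BLEU and human evaluation.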

context vector, language direction, vocab projection, (11 more...)

arXiv.org Artificial Intelligence

2208.06874

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > Spain (0.04)

Genre: Research Report > New Finding (0.87)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Root-aligned SMILES: A Tight Representation for Chemical Reaction Prediction

Zhong, Zipeng, Song, Jie, Feng, Zunlei, Liu, Tiantao, Jia, Lingxiang, Yao, Shaolun, Wu, Min, Hou, Tingjun, Song, Mingli

arXiv.org Artificial Intelligence, Aug-12-2022

Chemical reaction prediction, involving forward synthesis and retrosynthesis prediction, is a fundamental problem in organic synthesis. A popular computational paradigm formulates synthesis prediction as a sequence-to-sequence translation problem, where the typical SMILES is adopted for molecule representations. However, the general-purpose SMILES neglects the characteristics of chemical reactions, where the molecular graph topology is largely unaltered from reactants to products, resulting in the suboptimal performance of SMILES if straightforwardly applied. In this article, we propose the root-aligned SMILES (R-SMILES), which specifies a tightly aligned one-to-one mapping between the product and the reactant SMILES for more efficient synthesis prediction. Due to the strict one-to-one mapping and reduced edit distance, the computational model is largely relieved from learning the complex syntax and dedicated to learning the chemical knowledge for reactions. We compare the proposed R-SMILES with various state-of-the-art baselines and show that it significantly outperforms them all, demonstrating the superiority of the proposed method.

prediction, r-smile, reaction, (15 more...)

arXiv.org Artificial Intelligence

doi: 10.1039/D2SC02763A

2203.11444

Country: Asia > China > Zhejiang Province > Hangzhou (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.88)

Domain-Specific Text Generation for Machine Translation

Moslem, Yasmin, Haque, Rejwanul, Kelleher, John D., Way, Andy

arXiv.org Artificial Intelligence, Aug-11-2022

Preservation of domain knowledge from the source to target is crucial in any translation workflow. It is common in the translation industry to receive highly specialized projects, where there is hardly any parallel in-domain data. In such scenarios where there is insufficient in-domain data to fine-tune Machine Translation (MT) models, producing translations that are consistent with the relevant context is challenging. In this work, we propose a novel approach to domain adaptation leveraging state-of-the-art pretrained language models (LMs) for domain-specific data augmentation for MT, simulating the domain characteristics of either (a) a small bilingual dataset, or (b) the monolingual source text to be translated. Combining this idea with back-translation, we can generate huge amounts of synthetic bilingual in-domain data for both use cases. For our investigation, we use the state-of-the-art Transformer architecture. We employ mixed fine-tuning to train models that significantly improve translation of in-domain texts. More specifically, in both scenarios, our proposed methods achieve improvements of approximately 5-6 BLEU and 2-3 BLEU, respectively, on the Arabic-to-English and English-to-Arabic language pairs. Furthermore, the outcome of human evaluation corroborates the automatic evaluation results.

computational linguistic, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2208.05909

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Belgium > Brussels-Capital Region > Brussels (0.05)
Oceania > Australia > Victoria > Melbourne (0.04)
(21 more...)

Genre:

Research Report (1.00)
Workflow (0.88)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Language Tokens: A Frustratingly Simple Approach Improves Zero-Shot Performance of Multilingual Translation

ElNokrashy, Muhammad, Hendy, Amr, Maher, Mohamed, Afify, Mohamed, Awadalla, Hany Hassan

arXiv.org Artificial Intelligence, Aug-11-2022

Neural machine translation (NMT) has witnessed significant advances since the introduction of the transformer model (Vaswani et al., 2017). This model has shown impressive performance for bilingual translation commonly from and to English (Hassan et al., 2018). It has also been shown that the proposed model could be easily extended to multiple language pairs (Aharoni, Johnson, & Firat, 2019; Fan et al., 2020; Johnson et al., 2017; X. Wang, Tsvetkov, & Neubig, 2020), to and/or from English, by simple modifications to the basic architecture. This holds promise for improved performance for low-resource pairs through transfer learning, as well as better training and deployment costs per language pair. This setting is referred to as multilingual neural machine translation (MNMT). The mainstream method of training MNMT is to introduce an additional input tag at the encoder to indicate the target language, while the decoder uses the usual begin-of-sentence (BOS) token. This simple modification to the bilingual architecture is shown to work well up to hundreds of language pairs (Fan et al., 2020; Tran et al., 2021), given a corresponding increase in the number of parameters to handle the increased training data. Despite the emergence of modified architectures which add language-specific parameters, like language specific subnetworks (LASS) (Lin, Wu, Wang, & Li, 2021), and adapters (Bapna & Firat, 2019), the basic architecture remains the most effective choice for deploying large scale production systems.
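The mainstream MNMT setup described above (a target-language tag on the encoder input, an ordinary BOS token on the decoder side) can be illustrated with a short data-preparation sketch. The tag format `<2de>`, the `<s>` and `</s>` symbols, and the function name are assumptions for illustration. The abstract does not specify them, and this shows only the baseline tagging, not the paper's proposed change.

```python
def build_example(source_tokens, target_tokens, target_lang):
    """Build encoder input, decoder input, and labels for one training pair."""
    # Hypothetical tag format: "<2xx>" means "translate into language xx".
    lang_tag = f"<2{target_lang}>"
    encoder_input = [lang_tag] + list(source_tokens)
    # The decoder starts from a generic begin-of-sentence token, not a language tag.
    decoder_input = ["<s>"] + list(target_tokens)
    # Labels are the target shifted by one position, ending with end-of-sentence.
    labels = list(target_tokens) + ["</s>"]
    return encoder_input, decoder_input, labels


if __name__ == "__main__":
    enc, dec, lab = build_example(["Hello", "world"], ["Hallo", "Welt"], "de")
    print(enc)
    print(dec)
    print(lab)
```

Because the only language signal is a single input token, the same model and parameters serve every language pair, which is what makes this setup attractive for large-scale deployment.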

machine translation, retrieved, translation, (15 more...)

arXiv.org Artificial Intelligence

2208.05852

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(8 more...)

Genre: Research Report > Experimental Study (0.68)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Structural Biases for Improving Transformers on Translation into Morphologically Rich Languages

Soulos, Paul, Rao, Sudha, Smith, Caitlin, Rosen, Eric, Celikyilmaz, Asli, McCoy, R. Thomas, Jiang, Yichen, Haley, Coleman, Fernandez, Roland, Palangi, Hamid, Gao, Jianfeng, Smolensky, Paul

arXiv.org Artificial Intelligence, Aug-11-2022

The task of machine translation has seen major progress in recent times with the advent of large-scale Transformer-based models (e.g., Vaswani et al., 2017; Dehghani et al., 2019; Liu et al., 2020a). However, there has been less progress on language pairs that specifically involve morphologically rich languages. Moreover, although there has been previous work that builds linguistic structure into translation models to deal with morphological complexity (Sennrich and Haddow, 2016; Dalvi et al., 2017; Matthews et al., 2018), to the best of our knowledge there has not been work that applies such strategies to large-scale Transformer-based models. We hypothesize that providing Transformers access to structured linguistic representations can significantly boost their performance on translation into languages with complex morphology that encodes linguistic structure. In this work, we investigate two methods for introducing such structural bias into Transformer-based models. In the first method, we use the TP-Transformer (TPT) (Schlag et al., 2019), in which a traditional Transformer is augmented with Tensor Product Representations (TPRs) (Smolensky, 1990).

computational linguistic, proceedings, translation, (8 more...)

arXiv.org Artificial Intelligence

2208.06061

Country:

Asia > Middle East > Republic of Türkiye (0.14)
North America > Canada > Nunavut (0.04)
Europe > United Kingdom > England (0.04)
(11 more...)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

Important Uses of AI in Translation

#artificialintelligence, Aug-10-2022, 13:10:33 GMT

Before AI came into use, translation was a job that was time-consuming, well-paid, and required a high level of education. Thanks to AI, translation software makes translating a common service that is instant, free, and convenient. In this article, we will explore what machine translation is, how AI improves the industry, and why AI-powered software cannot replace human translators. Machine Translation uses AI-powered software to automatically translate the language in the source material to another language, without any interventions from human agents. In 1970, the first machine translation software was developed.

software, translation, translation software, (14 more...)

#artificialintelligence

Country: North America > United States > New York (0.05)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Graph Neural Networks for Multiparallel Word Alignment

Imani, Ayyoob, Şenel, Lütfi Kerem, Sabet, Masoud Jalili, Yvon, François, Schütze, Hinrich

arXiv.org Artificial Intelligence, Aug-10-2022

After a period of decrease, interest in word alignments is increasing again for their usefulness in domains such as typological research, cross-lingual annotation projection, and machine translation. Generally, alignment algorithms only use bitext and do not make use of the fact that many parallel corpora are multiparallel. Here, we compute high-quality word alignments between multiple language pairs by considering all language pairs together. First, we create a multiparallel word alignment graph, joining all bilingual word alignment pairs in one graph. Next, we use graph neural networks (GNNs) to exploit the graph structure. Our GNN approach (i) utilizes information about the meaning, position, and language of the input words, (ii) incorporates information from multiple parallel sentences, (iii) adds and removes edges from the initial alignments, and (iv) yields a prediction model that can generalize beyond the training sentences. We show that community detection provides valuable information for multiparallel word alignment. Our method outperforms previous work on three word-alignment datasets and on a downstream task.

alignment, computational linguistic, proceedings, (14 more...)

arXiv.org Artificial Intelligence

2203.08654

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Czechia > Prague (0.04)
Europe > France > Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.04)
(19 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
