AITopics | Machine Translation

Machine Translation

"Machine translation (MT) is the application of computers to the task of translating texts from one natural language to another. One of the very earliest pursuits in computer science, MT has proved to be an elusive goal, but today a number of systems are available which produce output which, if not perfect, is of sufficient quality to be useful in a number of specific domains."
– Definition from the European Association for Machine Translation (EAMT).

You can translate text of your choice by using free translators such as: CAPITA, Google Translate, SDL International, SYSTRAN.

A Cartoon Guide to Language Models in NLP (Part 1: Intuition)

#artificialintelligence, Nov-14-2021, 02:45:07 GMT

(This is a crosspost from the official Surge AI blog. If you need help with data labeling and NLP, say hello!) Language models are a core component of NLP systems, from machine translation to speech…

cheeseburger, language model, robot, (13 more...)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.70)

DEEP: DEnoising Entity Pre-training for Neural Machine Translation

Hu, Junjie, Hayashi, Hiroaki, Cho, Kyunghyun, Neubig, Graham

arXiv.org Artificial Intelligence, Nov-14-2021

It has been shown that machine translation models usually generate poor translations for named entities that are infrequent in the training corpus. Earlier named entity translation methods mainly focus on phonetic transliteration, which ignores the sentence context for translation and is limited in domain and language coverage. To address this limitation, we propose DEEP, a DEnoising Entity Pre-training method that leverages large amounts of monolingual data and a knowledge base to improve named entity translation accuracy within sentences. Besides, we investigate a multi-task learning strategy that finetunes a pre-trained neural machine translation model on both entity-augmented monolingual data and parallel data to further improve entity translation. Experimental results on three language pairs demonstrate that DEEP results in significant improvements over strong denoising auto-encoding baselines, with a gain of up to 1.3 BLEU and up to 9.2 entity accuracy points for English-Russian translation.

entity translation, proceedings, translation, (14 more...)

2111.07393

Country:

Europe > Russia > Southern Federal District > Krasnodar Krai > Krasnodar (0.05)
Europe > Russia > Volga Federal District > Ulyanovsk Oblast > Ulyanovsk (0.05)
Europe > Russia > Volga Federal District > Saratov Oblast > Saratov (0.05)
(4 more...)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Attention Mechanism in Vision Models

#artificialintelligence, Nov-13-2021, 00:55:45 GMT

In this article, we would like to explore the attention mechanism and subsequently understand its application in vision models. Attention was first introduced in the paper by Bahdanau et al. for neural machine translation. Attention is a technique that enables a network to focus on the parts of the input data that are most important for making a prediction. Since being introduced, it has revolutionized the entire field of NLP by being a key component in all the state-of-the-art models for a variety of tasks. The first paper we are discussing is 'Attention Is All You Need', published by Google Brain.
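As a rough illustration of the idea described above, the following minimal Python sketch computes scaled dot-product attention, the form used in 'Attention Is All You Need', for a single query vector. The function names (`softmax`, `attend`) and the toy vectors are assumptions made for this example and do not come from the article.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Scaled dot-product attention for one query.

    Each key is scored against the query; the scores become weights,
    and the output is the weighted average of the values.
    """
    d = len(query)
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
        for key in keys
    ]
    weights = softmax(scores)
    dim = len(values[0])
    context = [
        sum(w * v[i] for w, v in zip(weights, values))
        for i in range(dim)
    ]
    return weights, context

# Toy input: three positions, each with a 2-d key and a 2-d value.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
weights, context = attend([2.0, 0.0], keys, values)
```

With this query, the first and third keys receive equal weight and the second key receives less, so the output leans toward the values at the positions the query matches best.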

architecture, attention mechanism, machine translation, (12 more...)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.93)

Adapting machine translation models to new genres

#artificialintelligence, Nov-8-2021, 15:40:42 GMT

Neural machine translation systems are often optimized to perform well for specific text genres or domains, such as newspaper articles, user manuals, or customer support chats. In industrial settings with hundreds of language pairs to serve, however, a single translation system per language pair, which performs well across different text domains, is more efficient to deploy and maintain. Additionally, service providers may not know in advance which domains customers will be interested in. At this year's Conference on Empirical Methods in Natural Language Processing (EMNLP), we are presenting a new approach to multidomain adaptation for neural translation models, or adapting an existing model to new domains while maintaining translation quality in the original domain. Our approach provides a better trade-off between performance on old and new tasks than its predecessors do.

news article, translation model, translation system, (16 more...)

Genre: Research Report > New Finding (0.32)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Transformer Based Bengali Chatbot Using General Knowledge Dataset

Masum, Abu Kaisar Mohammad, Abujar, Sheikh, Akter, Sharmin, Ria, Nushrat Jahan, Hossain, Syed Akhter

arXiv.org Artificial Intelligence, Nov-8-2021

An AI chatbot can give impressive responses after learning from its training dataset. In this decade, most research has demonstrated that deep neural models are superior to other models. RNN models are regularly used for sequence-related problems such as pairing a question with its answer. This approach is widely known as seq2seq learning. A seq2seq model has an encoder and a decoder. The encoder embeds the input sequence, and the decoder embeds the output sequence. To reinforce seq2seq performance, an attention mechanism was added to the encoder and decoder. After that, the transformer model emerged as a high-performance model with multiple attention mechanisms for solving sequence-related problems. This model reduces training time compared with RNN-based models and also achieves state-of-the-art performance for sequence transduction. In this research, we applied the transformer model to a Bengali general knowledge chatbot based on a Bengali general knowledge Question Answer (QA) dataset. It scores 85.0 BLEU on the applied QA data. For comparison, we trained a seq2seq model with attention on our dataset, which scores 23.5 BLEU.

chatbot, decoder, transformer model, (10 more...)

2111.03937

Country: Asia > Bangladesh > Dhaka Division > Dhaka District > Dhaka (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.97)

Flight Demand Forecasting with Transformers

Wang, Liya, Mykityshyn, Amy, Johnson, Craig, Cheng, Jillian

arXiv.org Artificial Intelligence, Nov-4-2021

Transformers have become the de-facto standard in the natural language processing (NLP) field. They have also gained momentum in computer vision and other domains. Transformers can enable artificial intelligence (AI) models to dynamically focus on certain parts of their input and thus reason more effectively. Inspired by the success of transformers, we adopted this technique to predict strategic flight departure demand in multiple horizons. This work was conducted in support of a MITRE-developed mobile application, Pacer, which displays predicted departure demand to general aviation (GA) flight operators so they can have better situation awareness of the potential for departure delays during busy periods. Field demonstrations involving Pacer's previously designed rule-based prediction method showed that the prediction accuracy of departure demand still has room for improvement. This research strives to improve prediction accuracy from two key aspects: better data sources and robust forecasting algorithms. We leveraged two data sources, Aviation System Performance Metrics (ASPM) and System Wide Information Management (SWIM), as our input. We then trained forecasting models with temporal fusion transformer (TFT) for five different airports. Case studies show that TFTs can perform better than traditional forecasting methods by large margins, and they can result in better prediction across diverse airports and with better interpretability.

airport, prediction, tft model, (14 more...)

2111.04471

Country:

North America > United States > Virginia > Fairfax County > McLean (0.04)
North America > United States > New York (0.04)
North America > United States > Nevada > Clark County > Las Vegas (0.04)
(6 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Transportation > Infrastructure & Services > Airport (1.00)
Transportation > Air (1.00)
Government > Regional Government > North America Government > United States Government (0.95)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.69)

Lingua Custodia's participation at the WMT 2021 Machine Translation using Terminologies shared task

Ailem, Melissa, Liu, Jinghsu, Qader, Raheel

arXiv.org Artificial Intelligence, Nov-3-2021

This paper describes Lingua Custodia's submission to the WMT21 shared task on machine translation using terminologies. We consider three directions, namely English to French, Russian, and Chinese. We rely on a Transformer-based architecture as a building block, and we explore a method which introduces two main changes to the standard procedure to handle terminologies. The first one consists in augmenting the training data in such a way as to encourage the model to learn a copy behavior when it encounters terminology constraint terms. The second change is constraint token masking, whose purpose is to ease copy behavior learning and to improve model generalization. Empirical results show that our method satisfies most terminology constraints while maintaining high translation quality.
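The two changes described in the abstract can be pictured with a small Python sketch. This is only one plausible reading, not the authors' implementation: the tag strings (`<c>`, `</c>`, `<mask>`), the function name `annotate`, the single-token term matching, and the example sentence are all hypothetical choices made for illustration.

```python
def annotate(source_tokens, constraints, mask_source=False,
             start="<c>", end="</c>", mask="<mask>"):
    """Inline terminology constraints into a source sentence.

    After each source term that has a constraint, the required target
    term is appended between (hypothetical) tags, so a model can learn
    to copy it. With mask_source=True, the source term itself is
    replaced by a mask token, leaving only the constraint to copy.
    """
    out = []
    for tok in source_tokens:
        if tok in constraints:
            out.append(mask if mask_source else tok)
            out.extend([start, constraints[tok], end])
        else:
            out.append(tok)
    return out

# Hypothetical English-to-French example with one constrained term.
src = "the patient received a vaccine".split()
terms = {"vaccine": "vaccin"}

annotated = annotate(src, terms)                 # augmentation only
masked = annotate(src, terms, mask_source=True)  # with constraint token masking
```

Here `annotated` keeps the source word next to its required translation, while `masked` hides the source word so that the tagged constraint is the only cue for that position.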

constraint, terminology, translation, (11 more...)

2111.0212

Country:

North America > Canada (0.06)
Europe > France (0.04)
Europe > Finland > Uusimaa > Helsinki (0.04)

Genre: Research Report (0.69)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (0.76)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

UQuAD1.0: Development of an Urdu Question Answering Training Data for Machine Reading Comprehension

Kazi, Samreen, Khoja, Shakeel

arXiv.org Artificial Intelligence, Nov-2-2021

In recent years, low-resource Machine Reading Comprehension (MRC) has made significant progress, with models achieving remarkable performance on datasets in various languages. However, none of these models have been customized for the Urdu language. This work explores the semi-automated creation of the Urdu Question Answering Dataset (UQuAD1.0) by combining machine-translated SQuAD with human-generated samples derived from Wikipedia articles and Urdu RC worksheets from Cambridge O-level books. UQuAD1.0 is a large-scale Urdu dataset intended for extractive machine reading comprehension tasks, consisting of 49k question-answer pairs in question, passage, and answer format. In UQuAD1.0, 45,000 QA pairs were generated by machine translation of the original SQuAD1.0 and approximately 4,000 pairs via crowdsourcing. In this study, we used two types of MRC models: a rule-based baseline and advanced Transformer-based models. We found that the latter outperform the former; thus, we have decided to concentrate solely on Transformer-based architectures. Using XLMRoBERTa and multi-lingual BERT, we obtain F1 scores of 0.66 and 0.63, respectively.

comprehension, dataset, uquad1, (13 more...)

2111.01543

Country:

North America > United States > Texas > Harris County > Houston (0.14)
Asia > Pakistan > Punjab > Lahore Division > Lahore (0.04)
Indian Ocean > Arabian Sea (0.04)
(8 more...)

Genre: Research Report > New Finding (0.66)

Industry:

Media > Music (1.00)
Leisure & Entertainment (1.00)
Government (0.93)
Education > Assessment & Standards > Student Performance (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.72)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

How should human translation coexist with NMT? Efficient tool for building high quality parallel corpus

Park, Chanjun, Lee, Seolhwa, Moon, Hyeonseok, Eo, Sugyeong, Seo, Jaehyung, Lim, Heuiseok

arXiv.org Artificial Intelligence, Oct-30-2021

This paper proposes a tool for efficiently constructing high-quality parallel corpora while minimizing human labor, and makes this tool publicly available. Our proposed construction process is based on neural machine translation (NMT), so that it not only coexists with human translation but also improves its efficiency by combining data quality control with human translation in a data-centric approach.

corpus, parallel corpus, translation, (10 more...)

2111.00191

Country:

Oceania > Australia > New South Wales > Sydney (0.05)
Europe > Denmark > Capital Region > Copenhagen (0.05)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Understanding How Encoder-Decoder Architectures Attend

Aitken, Kyle, Ramasesh, Vinay V, Cao, Yuan, Maheswaranathan, Niru

arXiv.org Machine Learning, Oct-28-2021

Encoder-decoder networks with attention have proven to be a powerful way to solve many sequence-to-sequence tasks. In these networks, attention aligns encoder and decoder states and is often used for visualizing network behavior. However, the mechanisms used by networks to generate appropriate attention matrices are still mysterious. Moreover, how these mechanisms vary depending on the particular architecture used for the encoder and decoder (recurrent, feed-forward, etc.) is also not well understood. In this work, we investigate how encoder-decoder networks solve different sequence-to-sequence tasks. We introduce a way of decomposing hidden states over a sequence into temporal (independent of input) and input-driven (independent of sequence position) components. This reveals how attention matrices are formed: depending on the task requirements, networks rely more heavily on either the temporal or input-driven components. These findings hold across both recurrent and feed-forward architectures despite their differences in forming the temporal components. Overall, our results provide new insight into the inner workings of attention-based encoder-decoder networks.
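To make the temporal versus input-driven split concrete, here is a minimal Python sketch of one way such a decomposition could be computed by averaging. It is an assumption-laden toy, not the paper's procedure: hidden states are scalars stored in a table `hidden[t][x]` (position `t`, input token `x`), and the function name `decompose` and the example data are hypothetical.

```python
def decompose(hidden):
    """Split hidden[t][x] into temporal, input-driven, and residual parts.

    temporal[t]     : average over input tokens, so it depends only on position
    input_driven[x] : average over positions of what remains, so it depends
                      only on the input token
    residual[t][x]  : whatever neither component explains
    """
    n_pos, n_tok = len(hidden), len(hidden[0])
    temporal = [sum(row) / n_tok for row in hidden]
    input_driven = [
        sum(hidden[t][x] - temporal[t] for t in range(n_pos)) / n_pos
        for x in range(n_tok)
    ]
    residual = [
        [hidden[t][x] - temporal[t] - input_driven[x] for x in range(n_tok)]
        for t in range(n_pos)
    ]
    return temporal, input_driven, residual

# Toy hidden states built as "position effect + token effect".
position_effect = [0.0, 1.0, 2.0]
token_effect = [-1.0, 0.0, 1.0]
hidden = [[p + k for k in token_effect] for p in position_effect]

temporal, input_driven, residual = decompose(hidden)
```

Because the toy states are exactly additive, the averaging recovers the two effects and leaves a zero residual; on real hidden states the residual would indicate how much is not captured by either component.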

architecture, input component, temporal component, (14 more...)

2110.15253

Country:

North America > United States > California > Santa Clara County > Mountain View (0.04)
North America > United States > Washington > King County > Seattle (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)
Information Technology > Communications (0.66)
