AITopics | Machine Translation

Machine Translation

"Machine translation (MT) is the application of computers to the task of translating texts from one natural language to another. One of the very earliest pursuits in computer science, MT has proved to be an elusive goal, but today a number of systems are available which produce output which, if not perfect, is of sufficient quality to be useful in a number of specific domains."
– Definition from the European Association for Machine Translation (EAMT).

You can translate text of your choice by using free translators such as: CAPITA, Google Translate, SDL International, SYSTRAN.

How to stop AI from perpetuating harmful biases

#artificialintelligence | Apr-22-2020, 17:08:39 GMT

Artificial Intelligence (AI) is already re-configuring the world in conspicuous ways. Data drives our global digital ecosystem, and AI technologies reveal patterns in data. Smartphones, smart homes, and smart cities influence how we live and interact, and AI systems are increasingly involved in recruitment decisions, medical diagnoses, and judicial verdicts. Whether this scenario is utopian or dystopian depends on your perspective. The potential risks of AI are enumerated repeatedly.

ai system, training data, translation system, (10 more...)

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.06)

Industry: Information Technology (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.97)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.56)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.35)

Discretized Bottleneck in VAE: Posterior-Collapse-Free Sequence-to-Sequence Learning

Zhao, Yang, Yu, Ping, Mahapatra, Suchismit, Su, Qinliang, Chen, Changyou

arXiv.org Machine Learning | Apr-22-2020

Variational autoencoders (VAEs) are important tools in end-to-end representation learning. VAEs can capture complex data distributions and have been applied extensively in many natural-language-processing (NLP) tasks. However, a common pitfall in sequence-to-sequence learning with VAEs is the posterior-collapse issue in latent space, wherein the model tends to ignore latent variables when a strong auto-regressive decoder is implemented. In this paper, we propose a principled approach to eliminate this issue by applying a discretized bottleneck in the latent space. Specifically, we impose a shared discrete latent space where each input learns to choose a combination of shared latent atoms as its latent representation. Compared with VAEs employing continuous latent variables, our model shows more promising capability in modeling the underlying semantics of discrete sequences and can thus provide more interpretable latent structures. Empirically, we demonstrate the efficiency and effectiveness of our model on a broad range of tasks, including language modeling, unaligned text style transfer, dialog response generation, and neural machine translation.
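To make the idea of a shared discrete latent space more concrete, here is a minimal sketch of a nearest-atom codebook lookup in the style of vector quantization. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify how atoms are selected, and the names `quantize`, `codebook`, and `encoder_slots`, along with all numeric values, are hypothetical.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_atom(slot: Vector, codebook: List[Vector]) -> int:
    """Index of the codebook atom closest to one continuous encoder output."""
    best_index = 0
    best_distance = squared_distance(slot, codebook[0])
    for index in range(1, len(codebook)):
        distance = squared_distance(slot, codebook[index])
        if distance < best_distance:
            best_index, best_distance = index, distance
    return best_index


def quantize(slots: List[Vector], codebook: List[Vector]) -> Tuple[List[int], List[Vector]]:
    """Snap every encoder slot to its nearest shared atom.

    Assumption: the latent representation is the sequence of chosen atoms.
    """
    indices = [nearest_atom(slot, codebook) for slot in slots]
    return indices, [codebook[i] for i in indices]


# Hypothetical toy values: four shared atoms and three encoder outputs.
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
encoder_slots = [(0.1, -0.2), (0.9, 0.8), (0.2, 0.7)]
indices, latent = quantize(encoder_slots, codebook)
print(indices)  # [0, 3, 2]
```

Because the decoder in such a setup only ever sees atoms from the shared codebook, the latent code is a discrete choice rather than a continuous vector. A trainable version would also need a way to pass gradients through the lookup, which this sketch omits.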

arxiv preprint arxiv, autoencoder, codebook, (14 more...)

arXiv.org Machine Learning

2004.10603

Country: Asia > Vietnam > Hanoi > Hanoi (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.93)

Could AI make language learning obsolete?

#artificialintelligence | Apr-16-2020, 07:17:32 GMT

Perhaps we can expect an iPhone-like symphonic progression in models here? Many companies are throwing their hats into the translation technology ring. Web translation software is being surpassed by portable, state-of-the-art technology in the form of earpieces, hand-held devices and apps, all of which are enabling users to quickly navigate our multilingual world on the go. Most recently, American Airlines announced it is testing interpreter mode for Google Assistant to help communication between its employees and travellers who speak a different language. In recent years, artificial intelligence (AI) has drastically enhanced the accuracy and quality of foreign language translations – allowing machines to help break down language barriers for customer service teams and tourists alike.

ai make language, language learning, translation device, (4 more...)

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.05)
Europe > Germany (0.05)
Europe > France (0.05)
Europe > Austria (0.05)

Industry:

Transportation > Passenger (0.56)
Transportation > Air (0.56)
Education > Curriculum > Subject-Specific Education (0.50)
Health & Medicine > Therapeutic Area (0.49)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.35)

BLEU might be Guilty but References are not Innocent

Freitag, Markus, Grangier, David, Caswell, Isaac

arXiv.org Artificial Intelligence | Apr-13-2020

The quality of automatic metrics for machine translation has been increasingly called into question, especially for high-quality systems. This paper demonstrates that, while choice of metric is important, the nature of the references is also critical. We study different methods to collect references and compare their value in automated evaluation by reporting correlation with human evaluation for a variety of systems and metrics. Motivated by the finding that typical references exhibit poor diversity, concentrating around translationese language, we develop a paraphrasing task for linguists to perform on existing reference translations, which counteracts this bias. Our method yields higher correlation with human judgment not only for the submissions of WMT 2019 English to German, but also for back-translation and APE-augmented MT output, which have been shown to have low correlation with automatic metrics using standard references. We demonstrate that our methodology improves correlation with all modern evaluation metrics we look at, including embedding-based methods. To complete this picture, we reveal that multi-reference BLEU does not improve the correlation for high-quality output, and present an alternative multi-reference formulation that is more effective.
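For readers unfamiliar with how multiple references enter BLEU, the sketch below shows the standard clipped n-gram precision, where each candidate n-gram is credited at most as often as it appears in any single reference. This is the conventional multi-reference formulation only; it is not the alternative formulation the paper proposes, which the abstract does not describe. The function names and example sentences are illustrative assumptions.

```python
from collections import Counter
from typing import List


def ngrams(tokens: List[str], n: int) -> Counter:
    """Count all n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def clipped_precision(candidate: List[str], references: List[List[str]], n: int) -> float:
    """Modified n-gram precision with counts clipped by the best reference."""
    candidate_counts = ngrams(candidate, n)
    if not candidate_counts:
        return 0.0
    # For each n-gram, keep the largest count seen in any one reference.
    max_reference_counts: Counter = Counter()
    for reference in references:
        for gram, count in ngrams(reference, n).items():
            max_reference_counts[gram] = max(max_reference_counts[gram], count)
    clipped = sum(min(count, max_reference_counts[gram])
                  for gram, count in candidate_counts.items())
    return clipped / sum(candidate_counts.values())


# Hypothetical example sentences.
candidate = "the the the cat".split()
references = ["the cat sat".split(), "a cat is on the mat".split()]
unigram_precision = clipped_precision(candidate, references, 1)
bigram_precision = clipped_precision(candidate, references, 2)
print(unigram_precision)  # 0.5
```

Full BLEU combines these precisions over several n-gram orders with a brevity penalty; that part is left out here to keep the clipping step visible.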

computational linguistic, evaluation, translation, (13 more...)

arXiv.org Artificial Intelligence

2004.06063

Country:

Europe > Germany > Berlin (0.04)
Europe > Italy > Tuscany > Florence (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
(16 more...)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

An In-depth Walkthrough on Evolution of Neural Machine Translation

Jagtap, Rohan, Dhage, Sudhir N.

arXiv.org Artificial Intelligence | Apr-10-2020

Neural Machine Translation (NMT) methodologies have burgeoned from using simple feed-forward architectures to the state of the art, viz. the BERT model. The use cases of NMT models have broadened from language translation alone to conversational agents (chatbots), abstractive text summarization, image captioning, and more, where they have proved valuable in their respective applications. This paper aims to study the major trends in Neural Machine Translation, the state-of-the-art models in the domain, and a high-level comparison between them.

architecture, arxiv e-print, machine translation, (13 more...)

arXiv.org Artificial Intelligence

2004.04902

Country:

North America > United States (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
Europe > Denmark > Capital Region > Copenhagen (0.04)
Asia > India > Maharashtra > Mumbai (0.04)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)

Joint translation and unit conversion for end-to-end localization

Dinu, Georgiana, Mathur, Prashant, Federico, Marcello, Lauly, Stanislas, Al-Onaizan, Yaser

arXiv.org Artificial Intelligence | Apr-10-2020

A variety of natural language tasks require processing of textual data which contains a mix of natural language and formal languages such as mathematical expressions. In this paper, we take unit conversions as an example and propose a data augmentation technique which leads to models learning both translation and conversion tasks as well as how to adequately switch between them for end-to-end localization.
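As a rough illustration of what data augmentation for joint translation and unit conversion might look like, the sketch below generates synthetic sentence pairs in which the source mentions miles and the target mentions the converted kilometre value. Everything specific here is an assumption made for the example: the abstract does not describe the templates, the language pair, or the rounding and number formatting, and `augment_with_conversions` is a hypothetical helper.

```python
from typing import List, Tuple

MILES_TO_KM = 1.609344  # exact definition of the international mile in kilometres


def augment_with_conversions(miles_values: List[int]) -> List[Tuple[str, str]]:
    """Build (English source, German target) pairs with the unit converted.

    Assumptions: a single fixed sentence template, one decimal place,
    and a decimal comma on the German side.
    """
    pairs = []
    for miles in miles_values:
        kilometres = miles * MILES_TO_KM
        km_text = f"{kilometres:.1f}".replace(".", ",")
        source = f"The route is {miles} miles long."
        target = f"Die Strecke ist {km_text} km lang."
        pairs.append((source, target))
    return pairs


pairs = augment_with_conversions([5, 10])
for source, target in pairs:
    print(source, "->", target)
```

Pairs like these would be mixed into ordinary parallel data so that a single model sees both plain translation and translation that requires a conversion.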

computational linguistic, conversion, unit conversion, (13 more...)

arXiv.org Artificial Intelligence

2004.05219

Country:

Europe > Italy > Tuscany > Florence (0.04)
Europe > Germany > Berlin (0.04)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
(9 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.97)

Architecture for a multilingual Wikipedia

Vrandečić, Denny

arXiv.org Artificial Intelligence | Apr-8-2020

Wikipedia's vision is a world in which everyone can share in the sum of all knowledge. In its first two decades, this vision has been very unevenly achieved. One of the largest hindrances is the sheer number of languages Wikipedia needs to cover in order to achieve that goal. We argue that we need a new approach to tackle this problem more effectively, a multilingual Wikipedia where content can be shared between language editions. This paper proposes an architecture for a system that fulfills this goal. It separates the goal into two parts: creating and maintaining content in an abstract notation within a project called Abstract Wikipedia, and creating an infrastructure called Wikilambda that can translate this notation to natural language. Both parts are fully owned and maintained by the community, as is the integration of the results in the existing Wikipedia editions. This architecture will make more encyclopedic content available to more people in their own language, and at the same time allow more people to contribute knowledge and reach more people with their contributions, no matter what their respective language backgrounds. Additionally, Wikilambda will unlock a new type of knowledge asset people can share in through the Wikimedia projects, functions, which will vastly expand what people can do with knowledge from Wikimedia, and provide a new venue to collaborate and to engage the creativity of contributors from all around the world. These two projects will considerably expand the capabilities of the Wikimedia platform to enable every single human being to freely share in the sum of all knowledge.

abstract wikipedia, wikilambda, wikipedia, (16 more...)

arXiv.org Artificial Intelligence

2004.04733

Country:

North America > United States > California > San Francisco County > San Francisco (0.05)
North America > United States > California > Los Angeles County > Los Angeles (0.05)
North America > United States > California > San Diego County > San Diego (0.04)
(13 more...)

Genre: Research Report (1.00)

Industry: Government (0.46)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.67)

Applying Cyclical Learning Rate to Neural Machine Translation

Lee, Choon Meng, Liu, Jianfeng, Peng, Wei

arXiv.org Machine Learning | Apr-6-2020

In training deep learning networks, the optimizer and the related learning rate are often used without much thought or with minimal tuning, even though they are crucial in ensuring fast convergence to a good-quality minimum of the loss function that also generalizes well on the test dataset. Drawing inspiration from the successful application of cyclical learning rate policy for computer vision related convolutional networks and datasets, we explore how cyclical learning rate can be applied to train transformer-based neural networks for neural machine translation. From our carefully designed experiments, we show that the choice of optimizers and the associated cyclical learning rate policy can have a significant impact on the performance. In addition, we establish guidelines when applying cyclical learning rates to neural machine translation tasks. Thus, with our work, we hope to raise awareness of the importance of selecting the right optimizers and the accompanying learning rate policy and, at the same time, to encourage further research into easy-to-use learning rate policies.
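A cyclical learning rate can be sketched in a few lines. The example below implements the triangular policy commonly used in the cyclical learning rate literature, in which the rate climbs linearly from a lower bound to an upper bound and back again over a fixed number of iterations. Whether the paper uses this exact policy is not stated in the abstract, and the bounds and step size shown are illustrative assumptions rather than the authors' settings.

```python
import math


def triangular_clr(iteration: int, step_size: int, base_lr: float, max_lr: float) -> float:
    """Triangular cyclical learning rate.

    One full cycle lasts 2 * step_size iterations: step_size iterations
    rising from base_lr to max_lr, then step_size iterations falling back.
    """
    cycle = math.floor(1 + iteration / (2 * step_size))
    # x is the normalized distance from the peak of the current cycle.
    x = abs(iteration / step_size - 2 * cycle + 1)
    return base_lr + (max_lr - base_lr) * max(0.0, 1.0 - x)


# Hypothetical settings: bounds of 1e-4 and 1e-3, half-cycle of 4 iterations.
schedule = [triangular_clr(i, step_size=4, base_lr=1e-4, max_lr=1e-3) for i in range(9)]
for iteration, lr in enumerate(schedule):
    print(iteration, f"{lr:.6f}")
```

With these settings the rate starts at the lower bound, peaks at iteration 4, and returns to the lower bound at iteration 8, after which the cycle repeats.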

batch size, clr, learning rate, (14 more...)

arXiv.org Machine Learning

2004.02401

Country:

Europe > Italy > Trentino-Alto Adige/Südtirol > Trentino Province > Trento (0.04)
Europe > Germany > Berlin (0.04)
Europe > Czechia > Prague (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Neural Machine Translation with Imbalanced Classes

Gowda, Thamme, May, Jonathan

arXiv.org Machine Learning | Apr-5-2020

We cast neural machine translation (NMT) as a classification task in an autoregressive setting and analyze the limitations of both classification and autoregression components. Classifiers are known to perform better with balanced class distributions during training. Since the Zipfian nature of languages causes imbalanced classes, we explore the effect of class imbalance on NMT. We analyze the effect of vocabulary sizes on NMT performance and reveal an explanation for 'why' certain vocabulary sizes are better than others.
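To see why a Zipfian token distribution yields imbalanced classes, one can treat every vocabulary item as a class and summarize how unevenly the training tokens are spread across them. The sketch below computes two simple summaries, the ratio of the most to the least frequent class and the entropy normalized by its maximum. These particular measures and the helper name `class_imbalance` are assumptions chosen for illustration; the abstract does not say which imbalance measures the authors use.

```python
import math
from collections import Counter
from typing import Dict, List


def class_imbalance(tokens: List[str]) -> Dict[str, float]:
    """Summarize how unevenly tokens are distributed over vocabulary classes."""
    counts = Counter(tokens)
    total = sum(counts.values())
    probabilities = [count / total for count in counts.values()]
    entropy = -sum(p * math.log(p) for p in probabilities)
    # A uniform distribution over all classes has entropy log(number of classes).
    max_entropy = math.log(len(counts)) if len(counts) > 1 else 1.0
    return {
        "classes": len(counts),
        "max_min_ratio": max(counts.values()) / min(counts.values()),
        "normalized_entropy": entropy / max_entropy,
    }


# Hypothetical toy corpora: one skewed, one perfectly balanced.
skewed = class_imbalance("the cat sat on the mat the end".split())
balanced = class_imbalance("a b c d".split())
print(skewed)
print(balanced)
```

A normalized entropy near 1 indicates nearly balanced classes, while lower values indicate that a few frequent types dominate. Changing the vocabulary size, for example through subword segmentation, shifts these numbers, which is the kind of effect the paper analyzes.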

class imbalance, imbalance, vocabulary size, (15 more...)

arXiv.org Machine Learning

2004.02334

Country:

North America > United States > California (0.14)
Oceania > Australia (0.04)
North America > United States > Oregon > Multnomah County > Portland (0.04)
(6 more...)

Genre: Research Report (0.64)

Industry:

Government > Regional Government > North America Government > United States Government (0.46)
Government > Military (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Machine Translation for User-Generated Content

#artificialintelligence | Apr-3-2020, 23:29:34 GMT

A specific use case worth exploring in this regard is MT for User Generated Content (UGC). Because of the speed with which UGC (comments, feedback, reviews) is being created and the corresponding costs of its professional translation, many organizations turn to MT. Popular examples of such companies are Skype (in addition to text translation, Microsoft developed the Automatic Speech Recognition (ASR) for audio speech translation in Skype) and Facebook. The social network is aiming to solve the challenge of fine-tuning each system relating to a specific language pair, using neural machine translation (NMT) and benefiting from various contexts for translations. One solution that tackles this issue is the technology developed by Language I/O. It takes into account the client's glossaries and TMs, selects the best MT engine output and then improves on the results using cultural intelligence and/or human linguists who compare machine translations post-facto to ensure that their MT Optimizer engine learns over time.

machine translation, translation, user-generated content, (3 more...)

Industry: Information Technology (0.59)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
