AITopics | Machine Translation

Machine Translation

"Machine translation (MT) is the application of computers to the task of translating texts from one natural language to another. One of the very earliest pursuits in computer science, MT has proved to be an elusive goal, but today a number of systems are available which produce output which, if not perfect, is of sufficient quality to be useful in a number of specific domains."
– Definition from the European Association for Machine Translation (EAMT).

You can translate text of your choice by using free translators such as: CAPITA, Google Translate, SDL International, SYSTRAN.

Aligned Cross Entropy for Non-Autoregressive Machine Translation

Ghazvininejad, Marjan, Karpukhin, Vladimir, Zettlemoyer, Luke, Levy, Omer

arXiv.org Machine Learning | Apr-3-2020

Non-autoregressive machine translation models significantly speed up decoding by allowing for parallel prediction of the entire target sequence. However, modeling word order is more challenging due to the lack of autoregressive factors in the model. This difficulty is compounded during training with cross entropy loss, which can heavily penalize small shifts in word order. In this paper, we propose aligned cross entropy (AXE) as an alternative loss function for training of non-autoregressive models. AXE uses a differentiable dynamic program to assign loss based on the best possible monotonic alignment between target tokens and model predictions. AXE-based training of conditional masked language models (CMLMs) substantially improves performance on major WMT benchmarks, while setting a new state of the art for non-autoregressive models.
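
To make the idea of scoring against the best monotonic alignment concrete, here is a minimal pure-Python sketch. It is an illustration under assumptions, not the paper's implementation: the three moves (align, skip a prediction by emitting a blank, skip a target), the `blank` index, and the `skip_target_penalty` weight are hypothetical choices made for this example. The code computes only the loss value; a trainable version would run the same recurrence on tensors in an autodiff framework.

```python
import math


def positionwise_ce(log_probs, target):
    """Ordinary cross entropy: target token i is scored at prediction position i.

    Only the first len(target) prediction positions are used.
    """
    return -sum(log_probs[i][tok] for i, tok in enumerate(target))


def aligned_ce(log_probs, target, blank, skip_target_penalty=1.0):
    """Minimum negative log-likelihood over monotonic alignments (toy version).

    log_probs[j][v] is the log-probability of vocabulary item v at position j.
    cost[i][j] is the best cost of explaining the first i target tokens
    with the first j prediction positions.
    """
    n, m = len(target), len(log_probs)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    # With no target tokens consumed, every prediction must emit a blank.
    for j in range(1, m + 1):
        cost[0][j] = cost[0][j - 1] - log_probs[j - 1][blank]
    for i in range(1, n + 1):
        tok = target[i - 1]
        for j in range(1, m + 1):
            # Align target i with prediction j.
            align = cost[i - 1][j - 1] - log_probs[j - 1][tok]
            # Skip prediction j: it is charged for not emitting a blank.
            skip_pred = cost[i][j - 1] - log_probs[j - 1][blank]
            # Skip target i: score it at position j without consuming j.
            skip_tgt = cost[i - 1][j] - skip_target_penalty * log_probs[j - 1][tok]
            cost[i][j] = min(align, skip_pred, skip_tgt)
    return cost[n][m]


# Toy vocabulary (hypothetical): 0 = blank, 1 = "a", 2 = "b".
# The model predicts the right tokens, but shifted one position to the right.
probs = [
    [0.90, 0.05, 0.05],
    [0.05, 0.90, 0.05],
    [0.05, 0.05, 0.90],
]
log_probs = [[math.log(p) for p in row] for row in probs]
target = [1, 2]

ce = positionwise_ce(log_probs, target)
axe = aligned_ce(log_probs, target, blank=0)
print(f"position-wise CE: {ce:.3f}, aligned CE: {axe:.3f}")
```

On this toy input the position-wise loss is about 5.99, while the aligned loss is about 0.32, because the alignment absorbs the one-position shift by treating the first prediction as a blank.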

prediction, sequence, translation, (11 more...)

arXiv.org Machine Learning

2004.01655

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.04)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)

Genre: Research Report (0.51)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

A Set of Recommendations for Assessing Human-Machine Parity in Language Translation

Läubli, Samuel, Castilho, Sheila, Neubig, Graham, Sennrich, Rico, Shen, Qinlan, Toral, Antonio

arXiv.org Artificial Intelligence | Apr-3-2020

The quality of machine translation has increased remarkably over the past years, to the degree that it was found to be indistinguishable from professional human translation in a number of empirical investigations. We reassess Hassan et al.'s 2018 investigation into Chinese to English news translation, showing that the finding of human-machine parity was owed to weaknesses in the evaluation design, which is currently considered best practice in the field. We show that the professional human translations contained significantly fewer errors, and that perceived quality in human evaluation depends on the choice of raters, the availability of linguistic context, and the creation of reference translations. Our results call for revisiting current best practices to assess strong machine translation systems in general and human-machine parity in particular, for which we offer a set of recommendations based on our empirical findings.

evaluation, human translation, translation, (13 more...)

arXiv.org Artificial Intelligence

doi: 10.1613/jair.1.11371

2004.01694

Country:

Asia > China > Hong Kong (0.04)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
Asia > China > Shandong Province > Qingdao (0.04)
(19 more...)

Genre: Research Report > New Finding (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

A Clustering Framework for Lexical Normalization of Roman Urdu

Khan, Abdul Rafae, Karim, Asim, Sajjad, Hassan, Kamiran, Faisal, Xu, Jia

arXiv.org Artificial Intelligence | Mar-31-2020

Roman Urdu is an informal form of the Urdu language written in Roman script, which is widely used in South Asia for online textual content. It lacks standard spelling and hence poses several normalization challenges during automatic language processing. In this article, we present a feature-based clustering framework for the lexical normalization of Roman Urdu corpora, which includes a phonetic algorithm UrduPhone, a string matching component, a feature-based similarity function, and a clustering algorithm Lex-Var. UrduPhone encodes Roman Urdu strings to their pronunciation-based representations. The string matching component handles character-level variations that occur when writing Urdu using Roman script.
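
The abstract does not give UrduPhone's encoding rules, so the following is only a toy stand-in showing how a pronunciation-style key can group spelling variants. The function names `toy_phonetic_key` and `group_variants`, and the rule itself (keep the first letter, collapse doubled letters, drop later vowels), are assumptions made for this sketch and are not the authors' algorithm.

```python
from collections import defaultdict

VOWELS = set("aeiou")


def toy_phonetic_key(word):
    """Hypothetical key: first letter, then consonants with doubled letters collapsed."""
    word = word.lower()
    if not word:
        return ""
    key = [word[0]]
    prev = word[0]
    for ch in word[1:]:
        # Keep a character only if it is not a vowel and does not
        # repeat the character immediately before it.
        if ch != prev and ch not in VOWELS:
            key.append(ch)
        prev = ch
    return "".join(key)


def group_variants(words):
    """Group words that share the same toy phonetic key."""
    groups = defaultdict(list)
    for w in words:
        groups[toy_phonetic_key(w)].append(w)
    return dict(groups)


variants = ["kitab", "kitaab", "mohabbat", "muhabat"]
print(group_variants(variants))
```

A key this crude will over-merge unrelated words, which is presumably one reason the framework described above combines the phonetic encoding with string matching and a feature-based similarity function before clustering.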

dataset, normalization, variation, (12 more...)

arXiv.org Artificial Intelligence

doi: 10.1017/S1351324920000285

2004.00088

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > California > Los Angeles County > Los Angeles (0.14)
Africa > Middle East > Egypt > Giza Governorate > Giza (0.05)
(32 more...)

Genre: Research Report (1.00)

Industry: Information Technology (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

On the Integration of Linguistic Features into Statistical and Neural Machine Translation

Vanmassenhove, Eva

arXiv.org Artificial Intelligence | Mar-31-2020

New machine translation (MT) technologies are emerging rapidly and with them, bold claims of achieving human parity such as: (i) the results produced approach "accuracy achieved by average bilingual human translators" (Wu et al., 2017b) or (ii) the "translation quality is at human parity when compared to professional human translators" (Hassan et al., 2018) have seen the light of day (Läubli et al., 2018). Aside from the fact that many of these papers craft their own definition of human parity, these sensational claims are often not supported by a complete analysis of all aspects involved in translation. Establishing the discrepancies between the strengths of statistical approaches to MT and the way humans translate has been the starting point of our research. By looking at MT output and linguistic theory, we were able to identify some remaining issues. The problems range from simple number and gender agreement errors to more complex phenomena such as the correct translation of aspectual values and tenses. Our experiments confirm, along with other studies (Bentivogli et al., 2016), that neural MT has surpassed statistical MT in many aspects. However, some problems remain and others have emerged. We cover a series of problems related to the integration of specific linguistic features into statistical and neural MT, aiming to analyse and provide a solution to some of them. Our work focuses on addressing three main research questions that revolve around the complex relationship between linguistics and MT in general. We identify linguistic information that is lacking in order for automatic translation systems to produce more accurate translations and integrate additional features into the existing pipelines. We identify overgeneralization or 'algorithmic bias' as a potential drawback of neural MT and link it to many of the remaining linguistic issues.

accumulated frequency difference, exacerbation and decay count, plural subjunctive mood, (16 more...)

arXiv.org Artificial Intelligence

2003.14324

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
(79 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Education (0.92)
Government (0.92)
Law (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

29 Cutting Edge Applications of Artificial Intelligence - The Burnie Group

#artificialintelligence | Mar-24-2020, 13:37:46 GMT

Artificial Intelligence (AI) is the theory and development of computer systems that can perform tasks that normally require human intelligence. These tasks include visual perception, speech recognition, decision making, and language translation. Systems capable of performing such tasks are steadily transitioning from research laboratories into industry usage. AI technology is unique in that it is flexible in application. It can be used to improve processes, enhance interactions, and solve problems that, until recently, could only be performed by humans.

application, artificial intelligence, perception, (9 more...)

#artificialintelligence

Genre: Overview > Innovation (0.43)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.59)

On Interactive Machine Learning and the Potential of Cognitive Feedback

Michael, Chris J., Acklin, Dina, Scheuerman, Jaelle

arXiv.org Artificial Intelligence | Mar-23-2020

In order to increase productivity, capability, and data exploitation, numerous defense applications are experiencing an integration of state-of-the-art machine learning and AI into their architectures. Especially for defense applications, having a human analyst in the loop is of high interest due to quality control, accountability, and complex subject matter expertise not readily automated or replicated by AI. However, many applications are suffering from a very slow transition. This may be in large part due to lack of trust, usability, and productivity, especially when adapting to unforeseen classes and changes in mission context. Interactive machine learning is a newly emerging field in which machine learning implementations are trained, optimized, evaluated, and exploited through an intuitive human-computer interface. In this paper, we introduce interactive machine learning and explain its advantages and limitations within the context of defense applications. Furthermore, we address several of the shortcomings of interactive machine learning by discussing how cognitive feedback may inform features, data, and results in the state of the art. We define the three techniques by which cognitive feedback may be employed: self reporting, implicit cognitive feedback, and modeled cognitive feedback. The advantages and disadvantages of each technique are discussed.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2003.10365

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.82)

Industry:

Government > Military (1.00)
Health & Medicine > Therapeutic Area (0.68)
Government > Regional Government > North America Government > United States Government (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.47)

A Set of Recommendations for Assessing Human–Machine Parity in Language Translation

Journal of Artificial Intelligence Research | Mar-23-2020

The quality of machine translation has increased remarkably over the past years, to the degree that it was found to be indistinguishable from professional human translation in a number of empirical investigations. We reassess Hassan et al.'s 2018 investigation into Chinese to English news translation, showing that the finding of human-machine parity was owed to weaknesses in the evaluation design, which is currently considered best practice in the field. We show that the professional human translations contained significantly fewer errors, and that perceived quality in human evaluation depends on the choice of raters, the availability of linguistic context, and the creation of reference translations. Our results call for revisiting current best practices to assess strong machine translation systems in general and human-machine parity in particular, for which we offer a set of recommendations based on our empirical findings.

artificial intelligence, natural language, translation, (15 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.1.11371

AI Access Foundation

11371

Country:

Asia > China > Hong Kong (0.04)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
Asia > China > Shandong Province > Qingdao (0.04)
(19 more...)

Genre: Research Report > New Finding (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Google's New AI Transcribe Feature - A Translating Wiz

#artificialintelligence | Mar-22-2020, 19:12:54 GMT

Is the improved transcription feature a replacement for the earlier Google Live Transcribe? The latest audio-to-text translation service is out, but only for Android users for the time being. Record audio in one language and have it rendered as text in another. Lengthy discussions can now be transcribed into text without any trouble. January marked the launch of the AI-powered transcription feature of Google Translate on Android, and it now supports transcribed translations between any of eight languages: French, German, Portuguese, English, Thai, Hindi, Spanish, and Russian.

google, transcribe feature, transcription feature, (9 more...)

#artificialintelligence

Technology:

Information Technology > Communications > Mobile (0.79)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.43)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.34)

Dead Languages Come to Life

Communications of the ACM | Mar-21-2020, 13:02:15 GMT

Driven by advanced techniques in machine learning, commercial systems for automated language translation now nearly match the performance of human linguists, and far more efficiently. Google Translate supports 105 languages, from Afrikaans to Zulu, and in addition to printed text it can translate speech, handwriting, and the text found on websites and in images. The methods for doing those things are clever, but the key enabler lies in the huge annotated databases of writings in the various language pairs. A translation from French to English succeeds because the algorithms were trained on millions of actual translation examples. The expectation is that every word or phrase that comes into the system, with its associated rules and patterns of language structure, will have been seen and translated before.

barzilay, decipherment, translation, (15 more...)

Communications of the ACM

Country:

North America > United States > Virginia > Arlington County > Arlington (0.05)
North America > United States > Massachusetts (0.05)
North America > United States > California > San Diego County > San Diego (0.05)
(2 more...)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.30)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Translating documents with Amazon Translate, AWS Lambda, and the new Batch Translate API | Amazon Web Services

#artificialintelligence | Mar-20-2020, 23:30:12 GMT

With an increasing number of digital text documents shared across the world for both business and personal reasons, the need for translation capabilities becomes even more critical. There are multiple tools available online that enable people to copy/paste text and get the translated equivalent in the language of their choice. While this is a great way to perform ad hoc translation of a (limited) amount of text, it can be tedious and time-consuming if performed frequently. Your organization may largely depend on content to document your products and services, teach your customers how to interact with you, or just share the cool things you are doing. This content is often text-heavy and mostly written in English.

amazon translate, aws lambda, translation, (11 more...)

#artificialintelligence

Genre: Instructional Material (0.56)

Industry:

Retail > Online (0.40)
Information Technology > Services (0.40)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.53)
