AITopics | Machine Translation


Machine Translation

"Machine translation (MT) is the application of computers to the task of translating texts from one natural language to another. One of the very earliest pursuits in computer science, MT has proved to be an elusive goal, but today a number of systems are available which produce output which, if not perfect, is of sufficient quality to be useful in a number of specific domains."
– Definition from the European Association for Machine Translation (EAMT).

You can translate text of your choice by using free translators such as: CAPITA, Google Translate, SDL International, SYSTRAN.


Adversarial attacks against Fact Extraction and VERification

Thorne, James, Vlachos, Andreas

arXiv.org Artificial Intelligence, Mar-13-2019

This paper describes a baseline for the second iteration of the Fact Extraction and VERification shared task (FEVER2.0), which explores the resilience of systems through adversarial evaluation. We present a collection of simple adversarial attacks against systems that participated in the first FEVER shared task. FEVER modeled the assessment of truthfulness of written claims as a joint information retrieval and natural language inference task using evidence from Wikipedia. A large number of participants made use of deep neural networks in their submissions to the shared task. The extent to which such models understand language has been the subject of a number of recent investigations and discussions in the literature. In this paper, we present a simple method of generating entailment-preserving and entailment-altering perturbations of instances using common patterns within the training data. We find that a number of systems are greatly affected, with absolute losses in classification accuracy of up to 29% on the newly perturbed instances. Using these newly generated instances, we construct a sample submission for the FEVER2.0 shared task. Addressing these types of attacks will aid in building more robust fact-checking models, as well as suggest directions to expand the datasets.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

1903.05543

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > Belgium > Brussels-Capital Region > Brussels (0.05)
Oceania > Australia > Victoria > Melbourne (0.04)
(4 more...)

Genre: Research Report (0.66)

Industry:

Leisure & Entertainment (0.94)
Information Technology > Security & Privacy (0.61)
Government > Military (0.61)
Media > Film (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)


Artificial intelligence: AI is changing all the tech products around us

#artificialintelligence, Mar-10-2019, 17:28:39 GMT

The world's biggest consumer electronics show was held last month and wandering around the seemingly endless stalls of emerging new products, it was impossible to avoid the claims of artificial intelligence in some form or another. Some gadgets were, of course, smarter than others. From facial recognition food bowls for your pets to handheld speech recognition and language translation devices, smart tech and self-learning algorithms abound. The actual intelligence of some smart products is debatable but the trend is undeniable. Encompassing terms including deep learning, machine learning, neural networks and general artificial intelligence which seeks to build computers with a capacity to think and learn like humans, it can be hard to pin down what AI truly means. But it's clearly here to stay.

artificial intelligence, machine learning, natural language, (17 more...)

#artificialintelligence

Industry:

Information Technology (1.00)
Law > Intellectual Property & Technology Law (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.58)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.37)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.36)


Integrating Artificial and Human Intelligence for Efficient Translation

Herbig, Nico, Pal, Santanu, van Genabith, Josef, Krüger, Antonio

arXiv.org Artificial Intelligence, Mar-7-2019

It has been shown that post-editing (PE) can not only yield productivity gains of 36% [9], but also increase quality [2]. This paper discusses how human and artificial intelligence can be combined for efficient language translation, which tools already exist, and which open challenges remain (see Figure 1).

HARNESSING SYNERGIES BETWEEN AIS AND HUMANS

Draft Proposal

The PE process starts with an initial draft that is proposed by the AI and that the human uses as a basis. There are two main sources for this proposal: a machine translation (MT) and a translation memory (TM). Simply put, TMs are large databases of already completed human translations, which are matched (using fuzzy or exact matches) against the sentence to be translated to provide a starting point for PE. Machines can easily generate a variety of probable translations from (a combination of) MT and TM instead of only a single one; however, proposing too many, and possibly highly similar, translations could overwhelm the human.
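
The translation-memory lookup described above can be illustrated with a minimal sketch. It is not the authors' system: the function name `best_tm_match`, the 0.7 threshold, the sample sentence pairs, and the use of Python's `difflib.SequenceMatcher` as the fuzzy-match score are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def best_tm_match(source, memory, threshold=0.7):
    """Return (score, stored_source, stored_translation) for the closest
    translation-memory entry, or None if no entry reaches the threshold.

    `memory` is a list of (source_sentence, human_translation) pairs.
    The threshold of 0.7 is an illustrative assumption.
    """
    best = None
    for stored_source, stored_translation in memory:
        # Character-level similarity in [0, 1]; 1.0 means an exact match.
        score = SequenceMatcher(None, source.lower(), stored_source.lower()).ratio()
        if best is None or score > best[0]:
            best = (score, stored_source, stored_translation)
    if best is not None and best[0] >= threshold:
        return best
    return None


# Hypothetical two-entry memory (English to German), for illustration only.
memory = [
    ("Press the power button.", "Drücken Sie den Netzschalter."),
    ("Close the lid.", "Schließen Sie den Deckel."),
]

exact = best_tm_match("Press the power button.", memory)
fuzzy = best_tm_match("Press the red power button.", memory)
miss = best_tm_match("Completely unrelated sentence about cats.", memory)
```

Here `exact` is an exact match, `fuzzy` returns the stored translation as a draft for the post-editor to correct, and `miss` is `None`, in which case a draft would have to come from MT instead.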

artificial intelligence, natural language, translation, (8 more...)

arXiv.org Artificial Intelligence

1903.02978

Country: Europe > Germany > Saarland (0.05)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.48)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)


Detecting Overfitting via Adversarial Examples

Werpachowski, Roman, György, András, Szepesvári, Csaba

arXiv.org Machine Learning, Mar-6-2019

The repeated reuse of test sets in popular benchmark problems raises doubts about the credibility of reported test error rates. Verifying whether a learned model is overfitted to a test set is challenging as independent test sets drawn from the same data distribution are usually unavailable, while other test sets may introduce a distribution shift. We propose a new hypothesis test that uses only the original test data to detect overfitting. It utilizes a new unbiased error estimate that is based on adversarial examples generated from the test data and importance weighting. Overfitting is detected if this error estimate is sufficiently different from the original test error rate. The power of the method is illustrated using Monte Carlo simulations on a synthetic problem. We develop a specialized variant of our dependence detector for multiclass image classification, and apply it to testing overfitting of recent models to two popular real-world image classification benchmarks. In the case of ImageNet, our method was not able to detect overfitting to the test set for a state-of-the-art classifier, while on CIFAR-10 we found strong evidence of overfitting for the two recent model architectures we considered, and weak evidence of overfitting on the level of individual training runs.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Machine Learning

1903.0238

Country:

North America > United States (1.00)
North America > Canada > Alberta (0.28)

Genre:

Research Report > New Finding (0.68)
Research Report > Experimental Study (0.50)
Research Report > Promising Solution (0.45)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)


OpenKiwi: An Open Source Framework for Quality Estimation

#artificialintelligence, Mar-5-2019, 10:45:16 GMT

A year ago we told you why Quality Estimation is the missing piece in Machine Translation. Today, we have some exciting news to share about a new project from our AI Research team, with my colleagues Fábio Kepler, Sony Trénous, and Miguel Vera. Since 2016, Unbabel's AI team has been focused on advancing the state of the art in Quality Estimation (QE). Our models are running in production systems for 14 language pairs, with coverage and performance improving over time, thanks to the increasing amount of data produced by our human post-editors on a daily basis. This combination of AI and humans is what makes our translation pipeline fast and accurate, at scale.

artificial intelligence, natural language, quality estimation, (13 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.48)


What Did We Learn at the New Work Summit?

#artificialintelligence, Mar-3-2019, 17:25:17 GMT

MR. METZ: This is an ongoing problem. There have been very real and very significant gains in image recognition, speech recognition and language translation over the last several years. That can help with talking digital assistants, driverless cars and certain aspects of health care -- not to mention face recognition services and autonomous weapons. Driverless cars are still years from the mainstream. Better translation is very different from a more general intelligence that can do anything a human can do.

artificial intelligence, machine translation, natural language, (6 more...)

#artificialintelligence

Industry: Information Technology > Smart Houses & Appliances (0.40)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.57)


How machine learning can be used to break down language barriers

#artificialintelligence, Mar-2-2019, 04:28:49 GMT

Machine learning has transformed major aspects of the modern world with great success. Self-driving cars, intelligent virtual assistants on smartphones, and cybersecurity automation are all examples of how far the technology has come. But of all the applications of machine learning, few have the potential to so radically shape our economy as language translation. The content of language translation is the perfect model for machine learning to tackle. Language operates on a set of predictable rules, but with a degree of variation that makes it difficult for humans to interpret.

machine learning, natural language, translation, (16 more...)

#artificialintelligence

Country:

Europe > Switzerland > Zürich > Zürich (0.05)
Asia > China (0.05)

Industry: Information Technology > Security & Privacy (0.35)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)


Calibration of Encoder Decoder Models for Neural Machine Translation

Kumar, Aviral, Sarawagi, Sunita

arXiv.org Machine Learning, Mar-2-2019

We study the calibration of several state-of-the-art neural machine translation (NMT) systems built on attention-based encoder-decoder models. For structured outputs such as those in NMT, calibration is important not just for reliable confidence in predictions, but also for the proper functioning of beam-search inference. We show that most modern NMT models are surprisingly miscalibrated even when conditioned on the true previous tokens. Our investigation identifies two main reasons -- severe miscalibration of EOS (end of sequence marker) and suppression of attention uncertainty. We design recalibration methods based on these signals and demonstrate improved accuracy, better sequence-level calibration, and more intuitive results from beam search.
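
For readers unfamiliar with the term, a model is calibrated when its predicted confidence matches its observed accuracy. The sketch below computes expected calibration error (ECE), one common way to quantify this. It is an assumption that ECE with equal-width bins is a useful stand-in here; the abstract does not state which measure the authors use, and the function name and sample data are hypothetical.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Expected calibration error over equal-width confidence bins.

    confidences: predicted probability of the chosen token, each in [0, 1].
    correct:     booleans, True where the chosen token was the right one.
    """
    if len(confidences) != len(correct) or not confidences:
        raise ValueError("need two non-empty sequences of equal length")

    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # A confidence of exactly 1.0 falls into the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    total = len(confidences)
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        avg_conf = sum(c for c, _ in bucket) / len(bucket)
        accuracy = sum(1 for _, ok in bucket if ok) / len(bucket)
        # Weight each bin's confidence/accuracy gap by its share of predictions.
        ece += (len(bucket) / total) * abs(avg_conf - accuracy)
    return ece


# Ten predictions at 0.9 confidence, nine of them right: well calibrated.
calibrated = expected_calibration_error([0.9] * 10, [True] * 9 + [False])
# Same confidence but only half right: overconfident by about 0.4.
overconfident = expected_calibration_error([0.9] * 10, [True] * 5 + [False] * 5)
```

A value near zero means confidence tracks accuracy; a large value, as in the second call, signals the kind of miscalibration the abstract describes.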

calibration, machine learning, natural language, (18 more...)

arXiv.org Machine Learning

1903.00802

Country: North America > United States (0.68)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)


Jointly Optimizing Diversity and Relevance in Neural Response Generation

Gao, Xiang, Lee, Sungjin, Zhang, Yizhe, Brockett, Chris, Galley, Michel, Gao, Jianfeng, Dolan, Bill

arXiv.org Artificial Intelligence, Feb-28-2019

Although recent neural conversation models have shown great potential, they often generate bland and generic responses. While various approaches have been explored to diversify the output of the conversation model, the improvement often comes at the cost of decreased relevance. In this paper, we propose a method to jointly optimize diversity and relevance that essentially fuses the latent space of a sequence-to-sequence model and that of an autoencoder model by leveraging novel regularization terms. As a result, our approach induces a latent space in which the distance and direction from the predicted response vector roughly match the relevance and diversity, respectively. This property also lends itself well to an intuitive visualization of the latent space. Both automatic and human evaluation results demonstrate that the proposed approach brings significant improvement compared to strong baselines in both diversity and relevance.

artificial intelligence, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

1902.11205

Country:

North America > United States > Washington > King County > Redmond (0.04)
North America > United States > New York (0.04)

Genre: Research Report (0.70)

Industry:

Health & Medicine > Therapeutic Area > Neurology (0.40)
Media (0.31)
Leisure & Entertainment (0.30)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)


Non-Parametric Adaptation for Neural Machine Translation

Bapna, Ankur, Firat, Orhan

arXiv.org Machine Learning, Feb-28-2019

Neural networks trained with gradient descent are known to be susceptible to catastrophic forgetting caused by parameter shift during the training process. In the context of Neural Machine Translation (NMT), this results in poor performance on heterogeneous datasets and on sub-tasks like rare phrase translation. On the other hand, non-parametric approaches are immune to forgetting, perfectly complementing the generalization ability of NMT. However, attempts to combine non-parametric or retrieval-based approaches with NMT have only been successful on narrow domains, possibly due to over-reliance on sentence-level retrieval. We propose a novel n-gram-level retrieval approach that relies on local phrase-level similarities, allowing us to retrieve neighbors that are useful for translation even when overall sentence similarity is low. We complement this with an expressive neural network, allowing our model to extract information from the noisy retrieved context. We evaluate our semi-parametric NMT approach on a heterogeneous dataset composed of WMT, IWSLT, JRC-Acquis and OpenSubtitles, and demonstrate gains on all 4 evaluation sets. The semi-parametric nature of our approach opens the door for non-parametric domain adaptation, demonstrating strong inference-time adaptation performance on new domains without the need for any parameter updates.
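
The idea of retrieving neighbors by local phrase overlap rather than whole-sentence similarity can be sketched as follows. This is a deliberate simplification and not the authors' method: it ranks by exact bigram overlap over whitespace tokens, whereas the abstract describes phrase-level similarities feeding a neural network. The function names, the toy corpus, and the `<target N>` placeholders are all hypothetical.

```python
from collections import defaultdict


def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def build_index(corpus, n=2):
    """Map each source-side n-gram to the ids of the sentences containing it."""
    index = defaultdict(set)
    for sent_id, (source, _target) in enumerate(corpus):
        for gram in ngrams(source.lower().split(), n):
            index[gram].add(sent_id)
    return index


def retrieve(query, corpus, index, n=2, top_k=2):
    """Rank corpus pairs by the number of distinct n-grams shared with the query."""
    counts = defaultdict(int)
    for gram in set(ngrams(query.lower().split(), n)):
        for sent_id in index.get(gram, ()):
            counts[sent_id] += 1
    # Most shared n-grams first; ties broken by corpus order.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(corpus[sent_id], shared) for sent_id, shared in ranked[:top_k]]


# Toy parallel corpus; targets are placeholders, not real translations.
corpus = [
    ("the patient was given a low dose", "<target 0>"),
    ("a low dose of aspirin reduces risk", "<target 1>"),
    ("the weather is nice today", "<target 2>"),
]
index = build_index(corpus)
results = retrieve("doctors recommend a low dose of aspirin daily", corpus, index)
```

The second corpus sentence ranks first because it shares four bigrams with the query, even though the two sentences as a whole differ; the third sentence shares none and is not returned.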

machine learning, natural language, translation, (15 more...)

arXiv.org Machine Learning

1903.00058

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
