AITopics | Machine Translation

Collaborating Authors

Machine Translation

"Machine translation (MT) is the application of computers to the task of translating texts from one natural language to another. One of the very earliest pursuits in computer science, MT has proved to be an elusive goal, but today a number of systems are available which produce output which, if not perfect, is of sufficient quality to be useful in a number of specific domains."
– Definition from the European Association for Machine Translation (EAMT).

You can translate text of your choice by using free translators such as: CAPITA, Google Translate, SDL International, SYSTRAN.

News Overviews Instructional Materials AI-Alerts Classics

WeChat Neural Machine Translation Systems for WMT20

Meng, Fandong, Yan, Jianhao, Liu, Yijin, Gao, Yuan, Zeng, Xianfeng, Zeng, Qinsong, Li, Peng, Chen, Ming, Zhou, Jie, Liu, Sifan, Zhou, Hao

arXiv.org Artificial IntelligenceOct-5-2020

We participate in the WMT 2020 shared news translation task on Chinese to English. Our system is based on the Transformer (Vaswani et al., 2017a) with effective variants and the DTMT (Meng and Zhang, 2019) architecture. In our experiments, we employ data selection, several synthetic data generation approaches (i.e., back-translation, knowledge distillation, and iterative in-domain knowledge transfer), advanced finetuning approaches and self-bleu based model ensemble. Our constrained Chinese to English system achieves 36.9 case-sensitive BLEU score, which is the highest among all submissions.

artificial intelligence, natural language, transformer, (15 more...)

arXiv.org Artificial Intelligence

2010.00247

Country:

Europe > Italy > Tuscany > Florence (0.05)
Europe > Germany > Berlin (0.05)
Asia > China > Hong Kong (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)

Genre: Research Report > New Finding (0.49)

Industry: Information Technology > Services (0.41)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Add feedback

SSMBA: Self-Supervised Manifold Based Data Augmentation for Improving Out-of-Domain Robustness

Ng, Nathan, Cho, Kyunghyun, Ghassemi, Marzyeh

arXiv.org Machine LearningOct-4-2020

Models that perform well on a training domain often fail to generalize to out-of-domain (OOD) examples. Data augmentation is a common method used to prevent overfitting and improve OOD generalization. However, in natural language, it is difficult to generate new examples that stay on the underlying data manifold. We introduce SSMBA, a data augmentation method for generating synthetic training examples by using a pair of corruption and reconstruction functions to move randomly on a data manifold. We investigate the use of SSMBA in the natural language domain, leveraging the manifold assumption to reconstruct corrupted text with masked language models. In experiments on robustness benchmarks across 3 tasks and 9 datasets, SSMBA consistently outperforms existing data augmentation methods and baseline models on both in-domain and OOD data, achieving gains of 0.8% accuracy on OOD Amazon reviews, 1.8% accuracy on OOD MNLI, and 1.4 BLEU on in-domain IWSLT14 German-English.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Machine Learning

2009.10195

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Europe > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.04)
(14 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.88)

Add feedback

How Effective is Task-Agnostic Data Augmentation for Pretrained Transformers?

Longpre, Shayne, Wang, Yu, DuBois, Christopher

arXiv.org Artificial IntelligenceOct-4-2020

Task-agnostic forms of data augmentation have proven widely effective in computer vision, even on pretrained models. In NLP similar results are reported most commonly for low data regimes, non-pretrained models, or situationally for pretrained models. In this paper we ask how effective these techniques really are when applied to pretrained transformers. Using two popular varieties of task-agnostic data augmentation (not tailored to any particular task), Easy Data Augmentation (Wei and Zou, 2019) and Back-Translation (Sennrichet al., 2015), we conduct a systematic examination of their effects across 5 classification tasks, 6 datasets, and 3 variants of modern pretrained transformers, including BERT, XLNet, and RoBERTa. We observe a negative result, finding that techniques which previously reported strong improvements for non-pretrained models fail to consistently improve performance for pretrained transformers, even when training data is limited. We hope this empirical analysis helps inform practitioners where data augmentation techniques may confer improvements.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2010.01764

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.47)

Add feedback

Deep learning to translate between programming languages

#artificialintelligenceOct-2-2020, 03:07:35 GMT

Migrating a codebase from an archaic programming language such as COBOL to a modern alternative like Java or C is a difficult, resource-intensive task that requires expertise in both the source and target languages. COBOL, for example, is still widely used today in mainframe systems around the world, so companies, governments, and others often must choose whether to manually translate their code bases or commit to maintaining code written in a language that dates back to the 1950s. We've developed and open sourced TransCoder, an entirely self-supervised neural transcompiler system that can make code migration far easier and more efficient. Our method is the first AI system able to translate code from one programming language to another without requiring parallel data for training. We've demonstrated that TransCoder can successfully translate functions between C, Java, and Python 3. TransCoder outperforms open source and commercial rule-based translation programs.

machine learning, natural language, programming language, (18 more...)

#artificialintelligence

Technology:

Information Technology > Software > Programming Languages (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.51)

Add feedback

Data Transfer Approaches to Improve Seq-to-Seq Retrosynthesis

Ishiguro, Katsuhiko, Ujihara, Kazuya, Sawada, Ryohto, Akita, Hirotaka, Kotera, Masaaki

arXiv.org Machine LearningOct-2-2020

Retrosynthesis is a problem to infer reactant compounds to synthesize a given product compound through chemical reactions. Recent studies on retrosynthesis focus on proposing more sophisticated prediction models, but the dataset to feed the models also plays an essential role in achieving the best generalizing models. Generally, a dataset that is best suited for a specific task tends to be small. In such a case, it is the standard solution to transfer knowledge from a large or clean dataset in the same domain. In this paper, we conduct a systematic and intensive examination of data transfer approaches on end-to-end generative models, in application to retrosynthesis. Experimental results show that typical data transfer methods can improve test prediction scores of an off-the-shelf Transformer baseline model. Especially, the pre-training plus fine-tuning approach boosts the accuracy scores of the baseline, achieving the new state-of-the-art. In addition, we conduct a manual inspection for the erroneous prediction results. The inspection shows that the pre-training plus fine-tuning models can generate chemically appropriate or sensible proposals in almost all cases.

dataset, deep learning, neural network, (24 more...)

arXiv.org Machine Learning

2010.00792

Country:

Europe (0.14)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)

Genre: Research Report > New Finding (0.49)

Industry:

Energy > Oil & Gas (0.67)
Materials > Chemicals (0.46)
Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.68)

Add feedback

Neural Machine Translation: A Review

Stahlberg, Felix

Journal of Artificial Intelligence ResearchOct-2-2020

The field of machine translation (MT), the automatic translation of written text from one natural language into another, has experienced a major paradigm shift in recent years. Statistical MT, which mainly relies on various count-based models and which used to dominate MT research for decades, has largely been superseded by neural machine translation (NMT), which tackles translation with a single neural network. In this work we will trace back the origins of modern NMT architectures to word and sentence embeddings and earlier examples of the encoder-decoder network family. We will conclude with a short survey of more recent trends in the field.

computational linguistic, machine learning, natural language, (12 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.1.12007

AI Access Foundation

12007

Journal of Artificial Intelligence Research

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Massachusetts > Suffolk County > Boston (0.14)
(42 more...)

Genre: Overview (1.00)

Industry: Health & Medicine (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Autoregressive Entity Retrieval

De Cao, Nicola, Izacard, Gautier, Riedel, Sebastian, Petroni, Fabio

arXiv.org Machine LearningOct-2-2020

Entities are at the center of how we represent and aggregate knowledge. For instance, Encyclopedias such as Wikipedia are structured by entities (e.g., one per article). The ability to retrieve such entities given a query is fundamental for knowledge-intensive tasks such as entity linking and open-domain question answering. One way to understand current approaches is as classifiers among atomic labels, one for each entity. Their weight vectors are dense entity representations produced by encoding entity information such as descriptions. This approach leads to several shortcomings: i) context and entity affinity is mainly captured through a vector dot product, potentially missing fine-grained interactions between the two; ii) a large memory footprint is needed to store dense representations when considering large entity sets; iii) an appropriately hard set of negative data has to be subsampled at training time. We propose GENRE, the first system that retrieves entities by generating their unique names, left to right, token-by-token in an autoregressive fashion, and conditioned on the context. This enables to mitigate the aforementioned technical issues: i) the autoregressive formulation allows us to directly capture relations between context and entity name, effectively cross encoding both; ii) the memory footprint is greatly reduced because the parameters of our encoder-decoder architecture scale with vocabulary size, not entity count; iii) the exact softmax loss can be efficiently computed without the need to subsample negative data. We show the efficacy of the approach with more than 20 datasets on entity disambiguation, end-to-end entity linking and document retrieval tasks, achieving new SOTA, or very competitive results while using a tiny fraction of the memory of competing systems. Finally, we demonstrate that new entities can be added by simply specifying their unambiguous name.

computational linguistic, large language model, machine learning, (22 more...)

arXiv.org Machine Learning

2010.00904

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Spain > Galicia > Madrid (0.05)
Europe > Italy > Tuscany > Florence (0.04)
(14 more...)

Genre: Research Report (0.70)

Industry:

Leisure & Entertainment (0.47)
Transportation > Ground > Rail (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.93)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.93)
(4 more...)

Add feedback

A survey on natural language processing (nlp) and applications in insurance

Ly, Antoine, Uthayasooriyar, Benno, Wang, Tingting

arXiv.org Machine LearningOct-1-2020

Text is the most widely used means of communication today. This data is abundant but nevertheless complex to exploit within algorithms. For years, scientists have been trying to implement different techniques that enable computers to replicate some mechanisms of human reading. During the past five years, research disrupted the capacity of the algorithms to unleash the value of text data. It brings today, many opportunities for the insurance industry.Understanding those methods and, above all, knowing how to apply them is a major challenge and key to unleash the value of text data that have been stored for many years. Processing language with computer brings many new opportunities especially in the insurance sector where reports are central in the information used by insurers. SCOR's Data Analytics team has been working on the implementation of innovative tools or products that enable the use of the latest research on text analysis. Understanding text mining techniques in insurance enhances the monitoring of the underwritten risks and many processes that finally benefit policyholders.This article proposes to explain opportunities that Natural Language Processing (NLP) are providing to insurance. It details different methods used today in practice traces back the story of them. We also illustrate the implementation of certain methods using open source libraries and python codes that we have developed to facilitate the use of these techniques.After giving a general overview on the evolution of text mining during the past few years,we share about how to conduct a full study with text mining and share some examples to serve those models into insurance products or services. Finally, we explained in more details every step that composes a Natural Language Processing study to ensure the reader can have a deep understanding on the implementation.

ctober 2, information, representation, (15 more...)

arXiv.org Machine Learning

2010.00462

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > United States > New York (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
(3 more...)

Genre:

Research Report (0.70)
Overview (0.46)

Industry: Banking & Finance > Insurance (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.93)
Information Technology > Data Science > Data Mining > Text Mining (0.75)
(2 more...)

Add feedback

Does Machine Translation Affect International Trade? Evidence from a Large Digital Platform

#artificialintelligenceSep-27-2020, 17:35:34 GMT

Artificial intelligence (AI) is surpassing human performance in a growing number of domains. However, there is limited evidence of its economic effects. Using data from a digital platform, we study a key application of AI: machine translation. We find that the introduction of a new machine translation system has significantly increased international trade on this platform, increasing exports by 10.9%. Furthermore, heterogeneous treatment effects are consistent with a substantial reduction in translation costs.

international trade, machine translation, natural language, (2 more...)

#artificialintelligence

Industry:

Government > Foreign Policy (0.69)
Government > Commerce (0.69)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Add feedback

Hierarchical Pre-training for Sequence Labelling in Spoken Dialog

Chapuis, Emile, Colombo, Pierre, Manica, Matteo, Labeau, Matthieu, Clavel, Chloe

arXiv.org Artificial IntelligenceSep-23-2020

Sequence labelling tasks like Dialog Act and Emotion/Sentiment identification are a key component of spoken dialog systems. In this work, we propose a new approach to learn generic representations adapted to spoken dialog, which we evaluate on a new benchmark we call Sequence labellIng evaLuatIon benChmark fOr spoken laNguagE benchmark (\texttt{SILICONE}). \texttt{SILICONE} is model-agnostic and contains 10 different datasets of various sizes. We obtain our representations with a hierarchical encoder based on transformer architectures, for which we extend two well-known pre-training objectives. Pre-training is performed on OpenSubtitles: a large corpus of spoken dialog containing over $2.3$ billion of tokens. We demonstrate how hierarchical encoders achieve competitive results with consistently fewer parameters compared to state-of-the-art models and we show their importance for both pre-training and fine-tuning.

latexit sha1, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2009.11152

Country:

Europe > Switzerland > Zürich > Zürich (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > San Mateo County > Menlo Park (0.04)
(2 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.93)

Add feedback