AITopics | Machine Translation

Machine Translation

"Machine translation (MT) is the application of computers to the task of translating texts from one natural language to another. One of the very earliest pursuits in computer science, MT has proved to be an elusive goal, but today a number of systems are available which produce output which, if not perfect, is of sufficient quality to be useful in a number of specific domains."
– Definition from the European Association for Machine Translation (EAMT).

You can translate text of your choice by using free translators such as: CAPITA, Google Translate, SDL International, SYSTRAN.

Training Deeper Neural Machine Translation Models with Transparent Attention

Bapna, Ankur, Chen, Mia Xu, Firat, Orhan, Cao, Yuan, Wu, Yonghui

arXiv.org Artificial Intelligence, Sep-4-2018

While current state-of-the-art NMT models, such as RNN seq2seq and Transformers, possess a large number of parameters, they are still shallow in comparison to convolutional models used for both text and vision applications. In this work we attempt to train significantly (2-3x) deeper Transformer and Bi-RNN encoders for machine translation. We propose a simple modification to the attention mechanism that eases the optimization of deeper models, and results in consistent gains of 0.7-1.1 BLEU on the benchmark WMT'14 English-German and WMT'15 Czech-English tasks for both architectures.

artificial intelligence, machine learning, natural language, (17 more...)

1808.07561

Country: Europe > Denmark > Capital Region > Copenhagen (0.04)

Genre: Research Report > New Finding (0.69)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.31)

Unsupervised Statistical Machine Translation

Artetxe, Mikel, Labaka, Gorka, Agirre, Eneko

arXiv.org Artificial Intelligence, Sep-4-2018

While modern machine translation has relied on large parallel corpora, a recent line of work has managed to train Neural Machine Translation (NMT) systems from monolingual corpora only (Artetxe et al., 2018c; Lample et al., 2018). Despite the potential of this approach for low-resource settings, existing systems are far behind their supervised counterparts, limiting their practical interest. In this paper, we propose an alternative approach based on phrase-based Statistical Machine Translation (SMT) that significantly closes the gap with supervised systems. Our method profits from the modular architecture of SMT: we first induce a phrase table from monolingual corpora through cross-lingual embedding mappings, combine it with an n-gram language model, and fine-tune hyperparameters through an unsupervised MERT variant. In addition, iterative backtranslation improves results further, yielding, for instance, 14.08 and 26.22 BLEU points in WMT 2014 English-German and English-French, respectively, an improvement of more than 7-10 BLEU points over previous unsupervised systems, and closing the gap with supervised SMT (Moses trained on Europarl) down to 2-5 BLEU points. Our implementation is available at https://github.com/artetxem/monoses.

artificial intelligence, natural language, translation, (16 more...)

1809.01272

Country:

Oceania > Australia (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(12 more...)

Genre: Research Report > New Finding (0.46)

Industry: Government > Regional Government (0.46)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Pointwise HSIC: A Linear-Time Kernelized Co-occurrence Norm for Sparse Linguistic Expressions

Yokoi, Sho, Kobayashi, Sosuke, Fukumizu, Kenji, Suzuki, Jun, Inui, Kentaro

arXiv.org Machine Learning, Sep-4-2018

In this paper, we propose a new kernel-based co-occurrence measure that can be applied to sparse linguistic expressions (e.g., sentences) with a very short learning time, as an alternative to pointwise mutual information (PMI). Just as PMI is derived from mutual information, we derive this new measure from the Hilbert-Schmidt independence criterion (HSIC); thus, we call the new measure the pointwise HSIC (PHSIC). PHSIC can be interpreted as a smoothed variant of PMI that allows various similarity metrics (e.g., sentence embeddings) to be plugged in as kernels. Moreover, PHSIC can be estimated by simple and fast (linear in the size of the data) matrix calculations regardless of whether we use linear or nonlinear kernels. Empirically, in a dialogue response selection task, PHSIC is learned thousands of times faster than an RNN-based PMI while outperforming PMI in accuracy. We also demonstrate that PHSIC is beneficial as a criterion for a data selection task in machine translation, owing to its ability to give high (low) scores to a pair that is consistent (inconsistent) with other pairs.
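For readers unfamiliar with the baseline that the abstract compares against, the following is a minimal sketch of count-based PMI, PMI(x, y) = log p(x, y) / (p(x) p(y)), estimated from a list of observed pairs. It illustrates PMI only and is not the PHSIC estimator from the paper; the function name `pmi_table` and the toy dialogue pairs are assumptions made for illustration.

```python
import math
from collections import Counter


def pmi_table(pairs):
    """Estimate PMI for every observed (x, y) pair from raw co-occurrence counts.

    Hypothetical helper for illustration: probabilities are plain relative
    frequencies, so unseen pairs get no score at all. That sparsity problem
    is what motivates smoothed alternatives for whole sentences.
    """
    n = len(pairs)
    pair_counts = Counter(pairs)
    x_counts = Counter(x for x, _ in pairs)
    y_counts = Counter(y for _, y in pairs)
    return {
        (x, y): math.log((c / n) / ((x_counts[x] / n) * (y_counts[y] / n)))
        for (x, y), c in pair_counts.items()
    }


# Toy (utterance, response) pairs, invented for this example.
pairs = [
    ("hi", "hello"),
    ("hi", "hello"),
    ("bye", "see you"),
    ("hi", "see you"),
]
scores = pmi_table(pairs)
```

In this toy data, ("bye", "see you") receives a positive score because the two co-occur more often than independence would predict, while ("hi", "see you") receives a negative one.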

kernel, phsic, pmi, (13 more...)

doi: 10.18653/v1/D18-1203

1809.008

Country:

Asia > Middle East > Jordan (0.04)
Asia > Japan > Honshū > Tōhoku (0.04)
Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.04)
(2 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Approximate Distribution Matching for Sequence-to-Sequence Learning

Chen, Wenhu, Li, Guanlin, Liu, Shujie, Zhang, Zhirui, Li, Mu, Zhou, Ming

arXiv.org Artificial Intelligence, Sep-2-2018

Sequence-to-Sequence models were introduced to tackle many real-life problems like machine translation, summarization, image captioning, etc. The standard optimization algorithms are mainly based on example-to-example matching like maximum likelihood estimation, which is known to suffer from the data sparsity problem. Here we present an alternative view that explains sequence-to-sequence learning as a distribution matching problem, where each source or target example is viewed as representing a local latent distribution in the source or target domain. We then interpret sequence-to-sequence learning as learning a transductive model that transforms the source local latent distributions to match their corresponding target distributions. In our framework, we approximate both the source and target latent distributions with recurrent neural networks (augmenters). During training, the parallel augmenters learn to better approximate the local latent distributions, while the sequence prediction model learns to minimize the KL-divergence between the transformed source distributions and the approximated target distributions. This algorithm can alleviate the data sparsity issues in sequence learning by locally augmenting more unseen data pairs and increasing the model's robustness. Experiments conducted on machine translation and image captioning consistently demonstrate the superiority of our proposed algorithm over the other competing algorithms.
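The quantity being minimized in the abstract above is a KL-divergence between two distributions. As a minimal sketch of that quantity alone, assuming small discrete distributions given as lists of probabilities (the paper's distributions are over sequences and are approximated by neural networks, which is not reproduced here), it can be computed as follows. The function name `kl_divergence` and the `eps` guard are illustrative choices.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length.

    Terms with p_i == 0 contribute nothing; q_i is clamped at eps
    (an assumption of this sketch) to avoid division by zero.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    return sum(
        pi * math.log(pi / max(qi, eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


# Toy distributions, invented for this example.
p = [0.7, 0.2, 0.1]
q = [0.5, 0.3, 0.2]
divergence = kl_divergence(p, q)
```

The divergence is zero when the two distributions are identical and positive otherwise, and it is not symmetric: `kl_divergence(p, q)` generally differs from `kl_divergence(q, p)`.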

artificial intelligence, machine learning, natural language, (15 more...)

1808.08003

Country:

North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
North America > United States > California > Santa Barbara County > Santa Barbara (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(3 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.55)

Chinese Pinyin Aided IME, Input What You Have Not Keystroked Yet

Huang, Yafang, Zhao, Hai

arXiv.org Artificial Intelligence, Sep-2-2018

A Chinese pinyin input method engine (IME) converts pinyin into characters so that Chinese characters can be conveniently entered into a computer through a common keyboard. IMEs rely on their core component, pinyin-to-character conversion (P2C). Chinese IMEs usually predict a list of character sequences for the user to choose from, based only on the pinyin input at each turn. However, Chinese input is a multi-turn online procedure, which can be exploited to further improve the user experience. This paper therefore introduces, for the first time, a sequence-to-sequence model with a gated-attention mechanism for the core task in IMEs. The proposed neural P2C model is learned by encoding the previous input utterance as extra context, enabling the IME to predict character sequences from incomplete pinyin input. The model is evaluated on different benchmark datasets and shows a large improvement in user experience over traditional models, demonstrating the first engineering practice of building a Chinese aided IME.

artificial intelligence, machine learning, natural language, (16 more...)

1809.00329

Country: Asia > China > Shanghai > Shanghai (0.05)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.73)

Learning to Represent Bilingual Dictionaries

Chen, Muhao, Tian, Yingtao, Chen, Haochen, Chang, Kai-Wei, Skiena, Steven, Zaniolo, Carlo

arXiv.org Artificial Intelligence, Aug-31-2018

Bilingual word embeddings have been widely used to capture the correspondence of lexical semantics in different human languages. However, the cross-lingual correspondence between sentences and lexicons is less studied, even though this correspondence can greatly benefit many applications, such as cross-lingual semantic search and question answering. To bridge this gap, we propose a neural embedding model that leverages bilingual dictionaries. The proposed model is trained to map the literal word definitions to the cross-lingual target words, for which we explore different sentence encoding techniques. To enhance the learning process on limited resources, our model adopts several critical learning strategies, including multi-task learning on different bridges of languages, and joint learning of the dictionary model with a bilingual word embedding model. We conduct experiments on two tasks: (i) cross-lingual reverse dictionary retrieval, and (ii) bilingual paraphrase identification. In the former task, we demonstrate that our model is capable of comprehending bilingual concepts based on descriptions, and we also highlight the effectiveness of the proposed learning strategies. In the latter, we show that the proposed model effectively associates sentences in different languages via a shared embedding space, and outperforms existing approaches in identifying bilingual paraphrases.

artificial intelligence, machine learning, natural language, (20 more...)

1808.03726

Country: North America > United States > New York > Suffolk County > Stony Brook (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.95)

Another AI winter could usher in a dark period for artificial intelligence

Humans have been pondering the potential of artificial intelligence for thousands of years. Ancient Greeks believed, for example, that a bronze automaton named Talos protected the island of Crete from maritime adversaries. But AI only moved from the mythical realm to the real world in the last half-century, beginning with legendary computer scientist Alan Turing's foundational 1950 essay, which asked and provided a framework for answering the provocative question, "Can machines think?" At that time, the United States was in the midst of the Cold War. Congressional representatives decided to invest heavily in artificial intelligence as part of a larger security strategy.

artificial intelligence, machine learning, natural language, (9 more...)

Popular Science

Country: North America > United States (0.55)

Industry:

Information Technology > Security & Privacy (0.93)
Government > Military (0.75)

Technology:

Information Technology > Artificial Intelligence > History (0.73)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.37)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.35)

Learning Multilingual Word Embeddings in Latent Metric Space: A Geometric Approach

Jawanpuria, Pratik, Balgovind, Arjun, Kunchukuttan, Anoop, Mishra, Bamdev

arXiv.org Artificial Intelligence, Aug-28-2018

We propose a novel geometric approach for learning bilingual mappings given monolingual embeddings and a bilingual dictionary. Our approach decouples learning the transformation from the source language to the target language into (a) learning rotations for language-specific embeddings to align them to a common space, and (b) learning a similarity metric in the common space to model similarities between the embeddings. We model the bilingual mapping problem as an optimization problem on smooth Riemannian manifolds. We show that our approach outperforms previous approaches on the bilingual lexicon induction and cross-lingual word similarity tasks. We also generalize our framework to represent multiple languages in a common latent space. In particular, the latent space representations for several languages are learned jointly, given bilingual dictionaries for multiple language pairs. We illustrate the effectiveness of joint learning for multiple languages in a zero-shot word translation setting.

artificial intelligence, machine learning, natural language, (19 more...)

1808.08773

Country:

Asia > India (0.04)
North America > United States > New York (0.04)
North America > United States > New Jersey > Mercer County > Princeton (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

A Study of Reinforcement Learning for Neural Machine Translation

Wu, Lijun, Tian, Fei, Qin, Tao, Lai, Jianhuang, Liu, Tie-Yan

arXiv.org Artificial Intelligence, Aug-27-2018

Recent studies have shown that reinforcement learning (RL) is an effective approach for improving the performance of neural machine translation (NMT) systems. However, due to its instability, successful RL training is challenging, especially in real-world systems where deep models and large datasets are leveraged. In this paper, taking several large-scale translation tasks as testbeds, we conduct a systematic study on how to train better NMT models using reinforcement learning. We provide a comprehensive comparison of several important factors (e.g., baseline reward, reward shaping) in RL training. Furthermore, since it remains unclear whether RL is still beneficial when monolingual data is used, we propose a new method that leverages RL to further boost the performance of NMT systems trained with source/target monolingual data. By integrating all our findings, we obtain competitive results on WMT14 English-German, WMT17 English-Chinese, and WMT17 Chinese-English translation tasks, in particular setting a state-of-the-art performance on the WMT17 Chinese-English translation task.
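The "baseline reward" mentioned above can be illustrated with a minimal sketch of a REINFORCE-style surrogate loss, in which a baseline is subtracted from each sampled translation's reward before it weights that sample's log-probability. This is a generic textbook formulation, not the paper's training code; the function name `reinforce_loss`, the batch-mean default baseline, and the toy numbers are assumptions made for illustration.

```python
def reinforce_loss(log_probs, rewards, baseline=None):
    """Surrogate loss: -mean((reward - baseline) * log_prob) over sampled outputs.

    log_probs: log-probability the model assigned to each sampled translation.
    rewards:   a sentence-level score for each sample (e.g. a BLEU-like value).
    baseline:  if None, the batch-mean reward is used (an assumption of this
               sketch; a learned baseline is another common choice).
    """
    if len(log_probs) != len(rewards):
        raise ValueError("log_probs and rewards must have the same length")
    if baseline is None:
        baseline = sum(rewards) / len(rewards)
    weighted = [(r - baseline) * lp for lp, r in zip(log_probs, rewards)]
    return -sum(weighted) / len(weighted)


# Toy values, invented for this example.
log_probs = [-1.0, -2.0]
rewards = [0.3, 0.5]
loss = reinforce_loss(log_probs, rewards)
```

Subtracting the baseline means that samples scoring above average have their probability pushed up and samples scoring below average have it pushed down, which typically lowers the variance of the gradient estimate compared with using raw rewards.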

machine learning, monolingual data, natural language, (17 more...)

1808.08866

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Learning When to Concentrate or Divert Attention: Self-Adaptive Attention Temperature for Neural Machine Translation

Lin, Junyang, Sun, Xu, Ren, Xuancheng, Li, Muyu, Su, Qi

arXiv.org Artificial Intelligence, Aug-26-2018

Most Neural Machine Translation (NMT) models are based on the sequence-to-sequence (Seq2Seq) model with an encoder-decoder framework equipped with the attention mechanism. However, the conventional attention mechanism treats the decoding at each time step equally with the same matrix, which is problematic since the softness of the attention for different types of words (e.g. content words and function words) should differ. Therefore, we propose a new model with a mechanism called Self-Adaptive Control of Temperature (SACT) to control the softness of attention by means of an attention temperature. Experimental results on Chinese-English and English-Vietnamese translation demonstrate that our model outperforms the baseline models, and the analysis and the case study show that our model can attend to the most relevant elements in the source-side contexts and generate high-quality translations.
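To make "softness of attention" concrete, here is a minimal sketch of a softmax with a temperature applied to raw attention scores. It follows the common convention of dividing the scores by the temperature, which is an assumption and may differ from the parameterization used in SACT; the paper's mechanism for choosing the temperature at each decoding step is not reproduced, so the temperature is passed in as a plain argument. The function name and the toy scores are illustrative.

```python
import math


def softmax_with_temperature(scores, temperature=1.0):
    """Turn raw attention scores into weights that sum to 1.

    A low temperature sharpens the distribution (attention concentrates on
    the top-scoring source position); a high temperature flattens it.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Toy attention scores over three source positions, invented for this example.
scores = [2.0, 1.0, 0.1]
sharp = softmax_with_temperature(scores, temperature=0.5)
soft = softmax_with_temperature(scores, temperature=2.0)
```

With the same scores, `sharp` puts more weight on the first position than `soft` does, which is the kind of contrast the abstract describes between words that need focused attention and words that need diffuse attention.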

machine learning, natural language, translation, (18 more...)

1808.07374

Country:

Asia > Middle East > Iraq (0.05)
South America > Venezuela (0.05)
Asia > China (0.05)
(2 more...)

Genre: Research Report > New Finding (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
