AITopics | Machine Translation

Machine Translation

"Machine translation (MT) is the application of computers to the task of translating texts from one natural language to another. One of the very earliest pursuits in computer science, MT has proved to be an elusive goal, but today a number of systems are available which produce output which, if not perfect, is of sufficient quality to be useful in a number of specific domains."
– Definition from the European Association for Machine Translation (EAMT).

You can translate text of your choice by using free translators such as: CAPITA, Google Translate, SDL International, SYSTRAN.

t-viSNE: Interactive Assessment and Interpretation of t-SNE Projections

Chatzimparmpas, Angelos, Martins, Rafael Messias, Kerren, Andreas

arXiv.org Machine Learning | Feb-17-2020

t-Distributed Stochastic Neighbor Embedding (t-SNE) for the visualization of multidimensional data has proven to be a popular approach, with successful applications in a wide range of domains. Despite their usefulness, t-SNE projections can be hard to interpret or even misleading, which hurts the trustworthiness of the results. Understanding the details of t-SNE itself and the reasons behind specific patterns in its output may be a daunting task, especially for non-experts in dimensionality reduction. In this work, we present t-viSNE, an interactive tool for the visual exploration of t-SNE projections that enables analysts to inspect different aspects of their accuracy and meaning, such as the effects of hyper-parameters, distance and neighborhood preservation, densities and costs of specific neighborhoods, and the correlations between dimensions and visual patterns. We propose a coherent, accessible, and well-integrated collection of different views for the visualization of t-SNE projections. The applicability and usability of t-viSNE are demonstrated through hypothetical usage scenarios with real data sets. Finally, we present the results of a user study in which the tool's effectiveness was evaluated. By bringing to light information that would normally be lost after running t-SNE, we hope to support analysts in using t-SNE and in making its results easier to understand.
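The abstract lists neighborhood preservation among the aspects an analyst can inspect. As a rough illustration only, the sketch below computes one common version of that idea: the average fraction of each point's k nearest neighbors in the original space that remain among its k nearest neighbors after projection. The function names and the exact definition are assumptions made for this example; the abstract does not say how t-viSNE itself computes the measure.

```python
import math


def k_nearest(points, i, k):
    """Indices of the k points closest to points[i], excluding i itself."""
    others = [j for j in range(len(points)) if j != i]
    others.sort(key=lambda j: math.dist(points[i], points[j]))
    return set(others[:k])


def neighborhood_preservation(high_dim, low_dim, k):
    """Average fraction of each point's k nearest neighbors kept by the projection.

    Assumed definition for illustration; not taken from the t-viSNE paper.
    """
    n = len(high_dim)
    total = 0.0
    for i in range(n):
        kept = k_nearest(high_dim, i, k) & k_nearest(low_dim, i, k)
        total += len(kept) / k
    return total / n


# Four 3-D points forming two tight pairs.
high = [(0, 0, 0), (1, 0, 0), (5, 5, 5), (6, 5, 5)]
good = [(0, 0), (1, 0), (5, 5), (6, 5)]   # keeps the pairs together
bad = [(0,), (5,), (1,), (6,)]            # splits both pairs

print(neighborhood_preservation(high, good, k=1))
print(neighborhood_preservation(high, bad, k=1))
```

A score near 1 suggests that local structure survived the projection, while a score near 0 suggests that visually close points were not neighbors in the original data.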

dimension, neighborhood, projection, (15 more...)

arXiv.org Machine Learning

2002.0691

Country:

North America > United States > Wisconsin (0.04)
Europe > Portugal > Coimbra > Coimbra (0.04)
Europe > Sweden > Kronoberg County > Växjö (0.04)
(2 more...)

Genre:

Research Report > New Finding (0.92)
Questionnaire & Opinion Survey (0.86)

Industry: Health & Medicine > Therapeutic Area > Oncology (1.00)

Technology:

Information Technology > Visualization (1.00)
Information Technology > Human Computer Interaction (1.00)
Information Technology > Data Science > Data Mining (1.00)
(2 more...)

Learning from Multiple Partially Observed Views - an Application to Multilingual Text Categorization

Amini, Massih R., Usunier, Nicolas, Goutte, Cyril

Neural Information Processing Systems | Feb-15-2020, 00:58:24 GMT

We address the problem of learning classifiers when observations have multiple views, some of which may not be observed for all examples. We assume the existence of view-generating functions which may complete the missing views in an approximate way. This situation corresponds, for example, to learning text classifiers from multilingual collections where documents are not available in all languages. In that case, Machine Translation (MT) systems may be used to translate each document into the missing languages. We derive a generalization error bound for classifiers learned on examples with multiple artificially created views.
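The idea of a view-generating function can be sketched in a few lines. In the example below, each document is a mapping from language code to text, with `None` marking a missing view, and a translation function fills the gaps. The names `complete_views` and `fake_translate` are hypothetical, and `fake_translate` is a stand-in for a real MT system; none of this comes from the paper.

```python
from typing import Callable, Dict, List, Optional


def complete_views(
    doc: Dict[str, Optional[str]],
    languages: List[str],
    translate: Callable[[str, str, str], str],
) -> Dict[str, str]:
    """Fill in missing language views by translating from an observed one."""
    observed = {lang: text for lang, text in doc.items() if text is not None}
    if not observed:
        raise ValueError("at least one view must be observed")
    # Use the first observed view as the translation source (an arbitrary choice).
    src_lang, src_text = next(iter(observed.items()))
    completed = {}
    for lang in languages:
        if lang in observed:
            completed[lang] = observed[lang]
        else:
            # Artificially created view: only an approximation of the true one.
            completed[lang] = translate(src_text, src_lang, lang)
    return completed


def fake_translate(text: str, src: str, tgt: str) -> str:
    """Placeholder for an MT system; it only tags the text with the direction."""
    return f"[{src}->{tgt}] {text}"


doc = {"en": "hello", "fr": None}
print(complete_views(doc, ["en", "fr"], fake_translate))
```

A classifier could then be trained on the completed views, bearing in mind that the generated ones are approximate.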

application, multilingual text categorization, observed view, (5 more...)

Neural Information Processing Systems

Genre: Research Report (0.41)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Layer-Wise Coordination between Encoder and Decoder for Neural Machine Translation

He, Tianyu, Tan, Xu, Xia, Yingce, He, Di, Qin, Tao, Chen, Zhibo, Liu, Tie-Yan

Neural Information Processing Systems | Feb-14-2020, 19:58:13 GMT

Neural Machine Translation (NMT) has achieved remarkable progress with the rapid evolution of model structures. In this paper, we propose the concept of layer-wise coordination for NMT, which explicitly coordinates the learning of hidden representations of the encoder and decoder together layer by layer, gradually from low level to high level. Specifically, we design a layer-wise attention and mixed attention mechanism, and further share the parameters of each layer between the encoder and decoder to regularize and coordinate the learning. Experiments show that, combined with the state-of-the-art Transformer model, layer-wise coordination achieves improvements on three IWSLT and two WMT translation tasks. More specifically, our method achieves BLEU scores of 34.43 and 29.01 on the WMT16 English-Romanian and WMT14 English-German tasks, respectively, outperforming the Transformer baseline.

encoder and decoder, layer-wise coordination, neural machine translation

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Learned in Translation: Contextualized Word Vectors

McCann, Bryan, Bradbury, James, Xiong, Caiming, Socher, Richard

Neural Information Processing Systems | Feb-14-2020, 18:43:49 GMT

Computer vision has benefited from initializing multiple deep layers with weights pretrained on large supervised training sets like ImageNet. Natural language processing (NLP) typically sees initialization of only the lowest layer of deep models with pretrained word vectors. In this paper, we use a deep LSTM encoder from an attentional sequence-to-sequence model trained for machine translation (MT) to contextualize word vectors. We show that adding these context vectors (CoVe) improves performance over using only unsupervised word and character vectors on a wide variety of common NLP tasks: sentiment analysis (SST, IMDb), question classification (TREC), entailment (SNLI), and question answering (SQuAD). For fine-grained sentiment analysis and entailment, CoVe improves performance of our baseline models to the state of the art.

contextualized word vector, learned, translation, (2 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Learning to Teach with Dynamic Loss Functions

Wu, Lijun, Tian, Fei, Xia, Yingce, Fan, Yang, Qin, Tao, Jian-Huang, Lai, Liu, Tie-Yan

Neural Information Processing Systems | Feb-14-2020, 18:28:30 GMT

Teaching is critical to human society: it is with teaching that prospective students are educated and human civilization can be inherited and advanced. A good teacher not only provides his/her students with qualified teaching materials (e.g., textbooks), but also sets up appropriate learning objectives (e.g., course projects and exams) considering different situations of a student. When it comes to artificial intelligence, treating machine learning models as students, the loss functions that are optimized act as perfect counterparts of the learning objective set by the teacher. In this work, we explore the possibility of imitating human teaching behaviors by dynamically and automatically outputting appropriate loss functions to train machine learning models. Different from typical learning settings in which the loss function of a machine learning model is predefined and fixed, in our framework, the loss function of a machine learning model (we call it student) is defined by another machine learning model (we call it teacher).

dynamic loss function, loss function, student, (4 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.38)

Navigating with Graph Representations for Fast and Scalable Decoding of Neural Language Models

Zhang, Minjia, Wang, Wenhan, Liu, Xiaodong, Gao, Jianfeng, He, Yuxiong

Neural Information Processing Systems | Feb-14-2020, 18:26:07 GMT

Neural language models (NLMs) have recently gained renewed interest by achieving state-of-the-art performance across many natural language processing (NLP) tasks. However, NLMs are very computationally demanding, largely due to the cost of the decoding process, which consists of a softmax layer over a large vocabulary. We observe that in the decoding of many NLP tasks, only the probabilities of the top-K hypotheses need to be calculated precisely, and K is often much smaller than the vocabulary size. This paper proposes a novel softmax layer approximation algorithm, called Fast Graph Decoder (FGD), which quickly identifies, for a given context, a set of K words that are most likely to occur according to an NLM. We demonstrate that FGD reduces the decoding time by an order of magnitude while attaining close to the full softmax baseline accuracy on neural machine translation and language modeling tasks. We also prove a theoretical guarantee on the softmax approximation quality.
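To make the cost concrete, the sketch below shows the exhaustive baseline that such an approximation aims to avoid: it scans every logit to find the K most probable words and computes the full normalizer. This is not FGD, and the abstract does not specify how the graph search works. The function name `top_k_words` and the toy logits are assumptions for illustration.

```python
import heapq
import math


def softmax(logits):
    """Numerically stable softmax over the whole vocabulary."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]


def top_k_words(logits, k):
    """Exhaustive top-k: (word index, probability) pairs, most probable first.

    Both the normalizer and the selection touch every vocabulary entry,
    so the cost grows linearly with the vocabulary size.
    """
    m = max(logits)
    z = sum(math.exp(x - m) for x in logits)
    # Softmax is monotonic, so the largest logits are the most probable words.
    top = heapq.nlargest(k, range(len(logits)), key=lambda i: logits[i])
    return [(i, math.exp(logits[i] - m) / z) for i in top]


logits = [2.0, 0.5, 3.0, -1.0]  # toy vocabulary of four words
print(top_k_words(logits, k=2))
```

Because softmax preserves the ordering of the logits, finding the top-K words is a nearest-neighbor-style search over the output layer, which is the part an approximate method can accelerate.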

fast and scalable decoding, graph representation, neural language model, (2 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

MacNet: Transferring Knowledge from Machine Comprehension to Sequence-to-Sequence Models

Pan, Boyuan, Yang, Yazheng, Li, Hao, Zhao, Zhou, Zhuang, Yueting, Cai, Deng, He, Xiaofei

Neural Information Processing Systems | Feb-14-2020, 17:56:39 GMT

Machine Comprehension (MC) is one of the core problems in natural language processing, requiring both understanding of natural language and knowledge about the world. Rapid progress has been made since the release of several benchmark datasets, and recently state-of-the-art models have even surpassed human performance on the well-known SQuAD evaluation. In this paper, we transfer knowledge learned from machine comprehension to sequence-to-sequence tasks to deepen the understanding of the text. We propose MacNet: a novel encoder-decoder supplementary architecture to the widely used attention-based sequence-to-sequence models. Experiments on neural machine translation (NMT) and abstractive text summarization show that our proposed framework can significantly improve the performance of the baseline models, and our method for abstractive text summarization achieves state-of-the-art results on the Gigaword dataset.

machine comprehension, sequence-to-sequence model, transferring knowledge, (2 more...)

Neural Information Processing Systems

Genre: Research Report (0.46)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

SVD-Softmax: Fast Softmax Approximation on Large Vocabulary Neural Networks

Shim, Kyuhong, Lee, Minjae, Choi, Iksoo, Boo, Yoonho, Sung, Wonyong

Neural Information Processing Systems | Feb-14-2020, 17:27:31 GMT

We propose a fast approximation method of a softmax function with a very large vocabulary using singular value decomposition (SVD). The proposed method transforms the weight matrix used in the calculation of the output vector by using SVD. The approximate probability of each word can be estimated with only a small part of the weight matrix by using a few large singular values and the corresponding elements for most of the words. We applied the technique to language modeling and neural machine translation and present a guideline for good approximation. The algorithm requires only approximately 20% of arithmetic operations for an 800K vocabulary case and shows more than a three-fold speedup on a GPU.

fast softmax approximation, svd-softmax, vocabulary neural network, (1 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Can Active Memory Replace Attention?

Kaiser, Łukasz, Bengio, Samy

Neural Information Processing Systems | Feb-14-2020, 14:43:25 GMT

Several mechanisms to focus attention of a neural network on selected parts of its input or memory have been used successfully in deep learning models in recent years. Attention has improved image classification, image captioning, speech recognition, generative models, and learning algorithmic tasks, but it has probably had the largest impact on neural machine translation. Recently, similar improvements have been obtained using alternative mechanisms that do not focus on a single part of a memory but operate on all of it in parallel, in a uniform way. Such a mechanism, which we call active memory, improved over attention in algorithmic tasks, image processing, and generative modelling. So far, however, active memory has not improved over attention for most natural language processing tasks, in particular for machine translation.

active memory replace attention, machine translation, mechanism, (2 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.63)

Deliberation Networks: Sequence Generation Beyond One-Pass Decoding

Xia, Yingce, Tian, Fei, Wu, Lijun, Lin, Jianxin, Qin, Tao, Yu, Nenghai, Liu, Tie-Yan

Neural Information Processing Systems | Feb-14-2020, 08:57:32 GMT

The encoder-decoder framework has achieved promising progress for many sequence generation tasks, including machine translation, text summarization, dialog systems, and image captioning. Such a framework adopts a one-pass forward process while decoding and generating a sequence, but lacks a deliberation process: a generated sequence is directly used as the final output without further polishing. However, deliberation is a common behavior in daily human activities such as reading news and writing papers, articles, or books. In this work, we introduce the deliberation process into the encoder-decoder framework and propose deliberation networks for sequence generation. A deliberation network has two levels of decoders, where the first-pass decoder generates a raw sequence and the second-pass decoder polishes and refines the raw sequence with deliberation.

deliberation network, one-pass decoding, sequence generation, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.66)
