Parallel Attention Mechanisms in Neural Machine Translation

Oct-29-2018–arXiv.org Artificial Intelligence

Abstract: Recent papers in neural machine translation have proposed the strict use of attention mechanisms over previous standards such as recurrent and convolutional neural networks (RNNs and CNNs). We propose that running the traditionally stacked encoding branches of encoder-decoder, attention-focused architectures in parallel removes even more sequential operations from the model and thereby decreases training time. In particular, we modify Transformer, the attention-based architecture recently published by Google, by replacing its sequential attention modules with parallel ones, which reduces training time and substantially improves BLEU scores at the same time. Experiments on the English-to-German and English-to-French translation tasks show that our model establishes a new state of the art.

Historically, statistical machine translation involved extensive work in the alignment of words and phrases developed by linguistic experts working with computer scientists [1].
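The difference between stacked and parallel encoding branches described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not say how the branch outputs are combined, so the averaging step here is an assumption, as are the single-head attention, the omission of feed-forward and normalization sublayers, and all function names (`self_attention`, `stacked_encoder`, `parallel_encoder`).

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention(x, w_q, w_k, w_v):
    # Single-head scaled dot-product self-attention on a (seq_len, d_model) input.
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    return softmax(scores) @ v


def stacked_encoder(x, branches):
    # Stacked: each branch consumes the previous branch's output,
    # so the branches must be evaluated one after another.
    for w_q, w_k, w_v in branches:
        x = x + self_attention(x, w_q, w_k, w_v)
    return x


def parallel_encoder(x, branches):
    # Parallel: every branch reads the same input, so the branches have no
    # dependency on each other and could be evaluated concurrently.
    outputs = [self_attention(x, w_q, w_k, w_v) for w_q, w_k, w_v in branches]
    # ASSUMPTION: branch outputs are merged by a plain average.
    return x + np.mean(outputs, axis=0)


rng = np.random.default_rng(0)
seq_len, d_model, n_branches = 5, 8, 3
x = rng.normal(size=(seq_len, d_model))
branches = [
    tuple(rng.normal(scale=0.1, size=(d_model, d_model)) for _ in range(3))
    for _ in range(n_branches)
]

stacked_out = stacked_encoder(x, branches)
parallel_out = parallel_encoder(x, branches)
print(stacked_out.shape, parallel_out.shape)
```

In the stacked version the loop carries a data dependency from one branch to the next, while in the parallel version the list comprehension has none, which is the property the abstract credits for the reduced training time.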

artificial intelligence, machine learning, natural language

Country:
- North America > United States > Colorado (0.16)

Genre:
- Research Report (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Machine Translation (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (1.00)
