The Attention Mechanism in Natural Language Processing - seq2seq

The attention mechanism is now an established technique in many NLP tasks. I had heard about it often, but wanted to dig a bit deeper and understand the details. In this first blog post (I plan to publish a few more on the subject of attention) I give an introduction focused on the first proposal of the attention mechanism, as applied to the task of neural machine translation. To the best of my knowledge, the attention mechanism in the context of NLP was first presented in "Neural Machine Translation by Jointly Learning to Align and Translate" at ICLR 2015 (Bahdanau et al. 2015). It was proposed for machine translation, where, given a sentence in one language, the model has to produce a translation of that sentence in another language.