Neural Language Modeling From Scratch (Part 1)

#artificialintelligence 

The decoder is a simple function that takes a representation of the input word and returns a distribution representing the model's predictions for the next word: the model assigns to each word in the vocabulary the probability that it will be the next word in the sequence. This model is similar to the simple one, except that after encoding the current input word we feed the resulting representation (of size 200) into a two-layer LSTM, which then outputs a vector also of size 200. (At every time step the LSTM also receives a vector representing its previous state; this is not shown in the diagram.) In the input embedding, words that have similar meanings are represented by similar vectors (similar in terms of cosine similarity). The same holds on the output side: because the model would like, given the RNN output, to assign similar probability values to similar words, the decoder also represents similar words by similar vectors.
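To make the decoder step and the cosine-similarity claim concrete, here is a minimal sketch in plain Python. It is not the article's code: the toy vocabulary, the random initialization, and the names `decode`, `output_embedding`, and `cosine_similarity` are assumptions made for illustration, and a random vector stands in for the LSTM output (the LSTM itself is omitted). Only the vector size of 200 comes from the text.

```python
import math
import random


def softmax(logits):
    """Turn raw scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(hidden, output_embedding):
    """Score every vocabulary word against the hidden vector, then normalize."""
    logits = [sum(w * h for w, h in zip(row, hidden)) for row in output_embedding]
    return softmax(logits)


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


random.seed(0)
vocab = ["the", "cat", "dog", "sat", "ran"]  # hypothetical toy vocabulary
hidden_size = 200                            # vector size mentioned in the text

# Assumed random initialization: one row of size 200 per vocabulary word.
output_embedding = [
    [random.gauss(0.0, 0.1) for _ in range(hidden_size)] for _ in vocab
]

# Stand-in for the LSTM output at a single time step (the LSTM is omitted).
hidden = [random.gauss(0.0, 1.0) for _ in range(hidden_size)]

probs = decode(hidden, output_embedding)
for word, p in zip(vocab, probs):
    print(f"{word}: {p:.3f}")

# Compare the (untrained) vectors of two words.
print(cosine_similarity(output_embedding[1], output_embedding[2]))
```

In this sketch each word's score is the dot product of its row with the hidden vector, so two rows that lie close together receive close scores, and therefore close probabilities, for any hidden vector. That is the mechanism behind the last sentence of the paragraph above; with random, untrained rows the printed similarity carries no meaning.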
