Goto

Collaborating Authors

 RNN


The Unreasonable Effectiveness of Recurrent Neural Networks

#artificialintelligence

Moreover, as we'll see in a bit, RNNs combine the input vector with their state vector with a fixed (but learned) function to produce a new state vector. If training vanilla neural nets is optimization over functions, training recurrent nets is optimization over programs. At the core, RNNs have a deceptively simple API: They accept an input vector x and give you an output vector y. Written as a class, the RNN's API consists of a single step function: The RNN class has some internal state that it gets to update every time step is called.


Neural Language Modeling From Scratch (Part 1)

#artificialintelligence

The decoder is a simple function that takes a representation of the input word and returns a distribution which represents the model's predictions for the next word: the model assigns to each word the probability that it will be the next word in the sequence. This model is similar to the simple one, just that after encoding the current input word we feed the resulting representation (of size 200) into a two layer LSTM, which then outputs a vector also of size 200 (at every time step the LSTM also receives a vector representing its previous state- this is not shown in the diagram). In the input embedding, words that have similar meanings are represented by similar vectors (similar in terms of cosine similarity). Because the model would like to, given the RNN output, assign similar probability values to similar words, similar words are represented by similar vectors.


2dn4Dzq

#artificialintelligence

In blue we show the recurrent connections โ€“ the output'm' at time (t โ€“ 1) is fed back to the memory at time't' via the three gates; the cell value is fed back via the forget gate; the predicted word at time (t โ€“ 1) is fed back in addition to the memory output'm' at time't' into the Softmax for tag prediction. In spite of this fact, when we test images with multiple clothing type, our trained model generates tags for these unseen test images quite accurately ( 80% accurate). Prediction accuracy of our model improves quickly with increasing number of training iterations and stabilizes after about 20,000 iterations. Moreover, combining DCNN-RNN model helps us extend the trained model to solve completely different problem like fashion image tag generation.


Deep Learning at x.ai - x.ai

#artificialintelligence

When a RNN is trained on sequences of words, it learns to represent each word as a high dimensional vector which encodes the model's understanding of that word. If you take a step back and view the image as a whole, the large scale structure of the image is determined by words' part of speech. Nouns tend to lie in the center of the image, verbs tend to lie on the upper right side, and first names form a large orange cluster in the bottom left part of the image. The RNN learned all of this semantic understanding without a human ever having to code a definition of concepts like nouns, verbs, universities, cities, meetings, or social media.