I've been searching for a while for the precise way to feed time series data to a recurrent neural network (RNN, LSTM, GRU, ESN, etc.), with no real success. Here is a question that came close, but the answers aren't very clear: "Proper way of using recurrent neural network for time series analysis". I'm not looking for a breakdown of how the networks work, but rather for how to structure the input/output vectors for optimal results. So, let's say I'm working with a growing sinusoid. I am very familiar with the sliding time window approach, which works very well with feedforward networks (FFN) and with RNNs too, but I'm led to believe an RNN shouldn't need to be set up that way.
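For reference, the sliding time window setup mentioned above can be sketched as follows. This is only a minimal illustration: the signal `t * sin(t)`, the window length of 10, and the helper name `make_windows` are assumptions made for the example, not something taken from the linked question. The last line shows the `(samples, timesteps, features)` layout that many RNN libraries expect, which is the same windowed data with an extra feature axis.

```python
import numpy as np

def make_windows(series, window):
    """Return (X, y): X[i] holds `window` consecutive values, y[i] is the value that follows them."""
    series = np.asarray(series, dtype=float)
    n = len(series) - window
    X = np.stack([series[i:i + window] for i in range(n)])
    y = series[window:]
    return X, y

# Hypothetical "growing sinusoid": amplitude increases with time.
t = np.linspace(0.0, 20.0, 200)
signal = t * np.sin(t)

X, y = make_windows(signal, window=10)

# Feedforward-style input: one flat vector of 10 past values per sample.
print(X.shape, y.shape)

# Recurrent-style input: the same windows with an explicit feature axis,
# i.e. (samples, timesteps, features) with one feature per time step.
X_rnn = X[:, :, np.newaxis]
print(X_rnn.shape)
```

Whether an RNN needs the window at all, or can be fed one time step at a time while carrying its hidden state, is exactly the open point of the question; the sketch only pins down what the windowed layout looks like.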

This lecture will cover recurrent neural networks, the key ingredient in the deep learning toolbox for handling sequential computation and modelling sequences. It will start by explaining how gradients can be computed (by considering the time-unfolded graph) and how different architectures can be designed to summarize a sequence, generate a sequence by ancestral sampling in a fully-observed directed model, or learn to map a vector to a sequence, a sequence to a sequence (of the same or different length) or a sequence to a vector. The issue of long-term dependencies, why it arises, and what has been proposed to alleviate it will be a core subject of the discussion in this lecture. This includes changes in the architecture and initialization, as well as how to properly characterize the architecture in terms of recurrent or feedforward depth and its ability to create shortcuts or fast propagation of gradients in the unfolded graph. Open questions regarding the limitations of training by maximum likelihood (teacher forcing) and ideas towards making learning online (not requiring backprop through time) will also be discussed.

My current understanding of feature vectors is very limited. Let's say we are building a classifier using a standard convolutional neural network (CNN) followed by a fully connected neural network (FNN), where the CNN is responsible for extracting high-level features from an image (starting from edges and corners and building up to faces, etc.). In the transition from the CNN to the FNN we would most likely vectorize the images, and that vector is what I would call a feature vector. I was surprised to learn that even raw pixel values can be thought of as a feature vector.
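To make the two uses of "feature vector" concrete, here is a small sketch. The 28x28 image and the 8-channel 7x7 feature maps are made-up shapes for illustration only, and the feature maps are random numbers standing in for the output of a real CNN.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 28x28 grayscale image with pixel values in 0..255.
image = rng.integers(0, 256, size=(28, 28), dtype=np.uint8)

# Raw pixels as a feature vector: flatten and scale to [0, 1].
pixel_features = image.reshape(-1).astype(np.float32) / 255.0
print(pixel_features.shape)

# Stand-in for the output of the last convolutional layer:
# 8 channels of 7x7 activations (random here, learned in a real network).
feature_maps = rng.random((8, 7, 7))

# Flattening ("vectorizing") these maps gives the vector fed to the FNN.
cnn_features = feature_maps.reshape(-1)
print(cnn_features.shape)
```

In both cases the result is just a one-dimensional array of numbers describing the input; the difference is how much processing went into producing them.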

I want to create a deep neural network whose first layer maps words to vector embeddings. Essentially: Y = Wx + b, where Y is the corresponding word embedding and x is the one-hot encoded vector for a word. However, I found that feeding such a large x, full of zeros, as a placeholder for each word makes the model extremely slow, so I was wondering whether there are any alternatives aside from running word2vec in C and then just feeding in the already embedded word vectors.
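One alternative follows directly from the formula: multiplying W by a one-hot vector simply selects one column of W, so the layer can take an integer word index and look that column up instead of receiving the full one-hot vector. The sketch below checks this equivalence in NumPy; the vocabulary size, embedding dimension, and word index are arbitrary values chosen for the example.

```python
import numpy as np

vocab_size, embed_dim = 1000, 16   # hypothetical sizes
rng = np.random.default_rng(0)
W = rng.normal(size=(embed_dim, vocab_size))
b = np.zeros(embed_dim)

word_id = 42                       # hypothetical index of a word in the vocabulary

# Dense version: build the one-hot vector and compute Y = Wx + b.
x = np.zeros(vocab_size)
x[word_id] = 1.0
y_dense = W @ x + b

# Lookup version: pick column `word_id` of W directly.
y_lookup = W[:, word_id] + b

print(np.allclose(y_dense, y_lookup))
```

The lookup touches `embed_dim` numbers instead of `embed_dim * vocab_size`, and the input per word shrinks from a vector of length `vocab_size` to a single integer. Most deep learning frameworks expose this idea as an embedding (lookup) layer whose table is trained like any other weight.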

I have five sets of points that represent functions. Each was made by multiplying a constant by a vector of 10,000 numbers between 0 and 1, but I don't know what the constants are or what the vectors were in each case. I suspect they are all the same linearly spaced vector, because that's what my professor would do. I have to use a neural network to find the constants. My idea was to make one neuron with five inputs between 0 and 1 and train it: it would multiply the inputs by the weights, add them up, and check whether the result matches the sum of the five functions I have. When it finished training, I would read off the final weights, and those would be my constants.
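As a rough sketch of the "weights are the constants" idea, the code below trains a single linear neuron with one weight and no bias, by gradient descent, on each function separately. It assumes the suspicion above is right and the shared vector is `np.linspace(0, 1, 10000)`; the constants in `true_constants` are invented purely to generate demo data. Note that this fits one function at a time rather than the five-input neuron described above: if all five inputs are the same vector, matching only the sum of the functions would pin down the sum of the weights, not each weight individually.

```python
import numpy as np

# Assumption: every function uses the same linearly spaced input vector.
v = np.linspace(0.0, 1.0, 10_000)

# Made-up constants, used only to create example data for the demo.
true_constants = np.array([2.0, -1.5, 0.5, 3.0, 4.2])
functions = true_constants[:, None] * v          # shape (5, 10000)

def fit_constant(v, f, lr=0.5, steps=200):
    """Train one linear neuron (output = w * v) on target f with mean squared error."""
    w = 0.0
    for _ in range(steps):
        grad = 2.0 * np.mean((w * v - f) * v)    # derivative of the MSE with respect to w
        w -= lr * grad
    return w

estimates = np.array([fit_constant(v, f) for f in functions])
print(np.round(estimates, 4))
```

If the input vectors turn out to differ between functions, or are unknown, the constants cannot be recovered this way without further information about the vectors.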