In this paper, we propose a generic technique to model temporal dependencies and sequences using a combination of a recurrent neural network and a Deep Belief Network. Our technique, RNN-DBN, is an amalgamation of the memory state of the RNN that allows it to provide temporal information and a multi-layer DBN that helps in high level representation of the data. This makes RNN-DBNs ideal for sequence generation. Further, the use of a DBN in conjunction with the RNN makes this model capable of significantly more complex data representation than an RBM. We apply this technique to the task of polyphonic music generation.
Let me open this article with a question – "working love learning we on deep", did this make any sense to you? Not really – read this one – "We love working on deep learning". A little jumble in the words made the sentence incoherent. Well, can we expect a neural network to make sense out of it? If the human brain was confused on what it meant I am sure a neural network is going to have a tough time deciphering such text.
Learning with recurrent neural networks (RNNs) on long sequences is a notoriously difficult task. There are three major challenges: 1) complex dependencies, 2) vanishing and exploding gradients, and 3) efficient parallelization. In this paper, we introduce a simple yet effective RNN connection structure, the DilatedRNN, which simultaneously tackles all of these challenges. The proposed architecture is characterized by multi-resolution dilated recurrent skip connections and can be combined flexibly with diverse RNN cells. Moreover, the DilatedRNN reduces the number of parameters needed and enhances training efficiency significantly, while matching state-of-the-art performance (even with standard RNN cells) in tasks involving very long-term dependencies.
We have recently shown that when initiated with "small" weights, many connectionist models with feedback connections are inherently biased towards Markov models, i.e. even prior to any training, dynamics of the models can be readily used to extract finite memory machines (Tiňo, Čerňanský, & Beňušková 2004; Hammer & Tiňo 2003). In this study we briefly outline the core arguments for such claims and generalize the results to recursive neural networks capable of processing ordered trees. In the early stages of learning, the compositional organization of recursive activations has a Markovian structure: Trees sharing a top subtree are mapped close to each other. The deeper is the shared subtree, the closer are the trees mapped.