Deep metric learning improves lab of origin prediction of genetically engineered plasmids


Genome engineering is undergoing unprecedented development and is now becoming widely available. To ensure responsible biotechnology innovation and to reduce misuse of engineered DNA sequences, it is vital to develop tools to identify the lab-of-origin of engineered plasmids. Genetic engineering attribution (GEA), the ability to make sequence-lab associations, would support forensic experts in this process. Here, we propose a method, based on metric learning, that ranks the most likely labs-of-origin whilst simultaneously generating embeddings for plasmid sequences and labs. These embeddings can be used to perform various downstream tasks, such as clustering DNA sequences and labs, as well as using them as features in machine learning models.
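The core idea — embedding plasmid sequences and labs into one shared space and ranking labs by similarity — can be sketched in a few lines. The embeddings and lab names below are toy values made up for illustration; in the actual system they would be learned jointly with a metric-learning objective (e.g. a triplet or contrastive loss):

```python
from math import sqrt

def cosine(u, v):
    # Cosine similarity between two equal-length vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = sqrt(sum(a * a for a in u))
    nv = sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def rank_labs(seq_emb, lab_embs):
    # Rank candidate labs by similarity of their embedding to the
    # plasmid-sequence embedding; higher similarity = more likely origin.
    scored = [(lab, cosine(seq_emb, emb)) for lab, emb in lab_embs.items()]
    return sorted(scored, key=lambda x: x[1], reverse=True)

# Toy, hand-made embeddings (a real system learns these from sequence data).
labs = {
    "lab_A": [0.9, 0.1, 0.0],
    "lab_B": [0.1, 0.8, 0.3],
    "lab_C": [0.0, 0.2, 0.9],
}
plasmid = [0.8, 0.2, 0.1]

ranking = rank_labs(plasmid, labs)
print([lab for lab, _ in ranking])  # lab_A ranks first
```

Because sequences and labs live in the same space, the same vectors double as general-purpose features for the downstream clustering and classification tasks the abstract mentions.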

Sequence Transduction with Recurrent Neural Networks

Many machine learning tasks can be expressed as the transformation---or \emph{transduction}---of input sequences into output sequences: speech recognition, machine translation, protein secondary structure prediction and text-to-speech to name but a few. One of the key challenges in sequence transduction is learning to represent both the input and output sequences in a way that is invariant to sequential distortions such as shrinking, stretching and translating. Recurrent neural networks (RNNs) are a powerful sequence learning architecture that has proven capable of learning such representations. However RNNs traditionally require a pre-defined alignment between the input and output sequences to perform transduction. This is a severe limitation since \emph{finding} the alignment is the most difficult aspect of many sequence transduction problems. Indeed, even determining the length of the output sequence is often challenging. This paper introduces an end-to-end, probabilistic sequence transduction system, based entirely on RNNs, that is in principle able to transform any input sequence into any finite, discrete output sequence. Experimental results for phoneme recognition are provided on the TIMIT speech corpus.
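The way the transducer avoids a pre-defined alignment is by summing over every possible alignment with a forward recursion over a T-by-U output lattice. The sketch below is a deliberately simplified toy model (constant blank/label probabilities at every lattice node, not the paper's RNN-parameterised ones) that shows the marginalisation itself:

```python
from math import comb

def alignment_marginal(T, U, p_blank, p_label):
    """Total probability of a length-U output given T input frames,
    summed over all monotone alignments. Toy model: the same
    blank/label probabilities at every lattice node."""
    # alpha[t][u]: probability of having emitted u labels after frame t.
    alpha = [[0.0] * (U + 1) for _ in range(T + 1)]
    alpha[1][0] = 1.0
    for t in range(1, T + 1):
        for u in range(U + 1):
            if t == 1 and u == 0:
                continue
            stay = alpha[t - 1][u] * p_blank            # consume a frame
            emit = alpha[t][u - 1] * p_label if u > 0 else 0.0  # emit a label
            alpha[t][u] = stay + emit
    return alpha[T][U] * p_blank  # a final blank ends the output

# Closed-form check: C(T-1+U, U) paths, each with prob p_blank**T * p_label**U.
T, U, pb, pl = 3, 2, 0.6, 0.4
closed = comb(T - 1 + U, U) * pb**T * pl**U
print(abs(alignment_marginal(T, U, pb, pl) - closed) < 1e-9)  # True
```

In the real model the per-node probabilities come from combining a transcription network (over the input) with a prediction network (over the output so far), but the recursion that replaces a fixed alignment is exactly this one.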

r/MachineLearning - [Discussion] Google Patents "Generating output sequences from input sequences using neural networks"


Between this, the GAN evaluation paper that happened to be strikingly similar to a previously published paper by other authors, and DeepMind's PR machine withholding the crucial details that make their Go models so good, I am definitely more and more disappointed in DeepMind ...

Artificial Intelligence - Modules Part II


The diagram above relies on several concepts whose detailed explanation is beyond the scope of this article: Kernel (Filter), ReLU, Pooling, Convolution, Stride, Flatten, and Fully Connected layers. RNNs are widely used in text analysis, sequence models, time-series analysis, and video processing, whereas language modeling tasks require modeling a large number of possible values (words in the vocabulary) per input feature. RNNs and their variants are explained accessibly in Colah's blog.
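The defining property of an RNN — the same weights applied at every time step, with a hidden state carried forward — fits in a few lines. The weights and inputs below are made-up toy values, just to show the recurrence:

```python
from math import tanh

def rnn_step(x, h, W_xh, W_hh, b):
    # One step of a vanilla RNN: h' = tanh(W_xh @ x + W_hh @ h + b).
    # Reusing the same weights at every step is what lets an RNN
    # process sequences of arbitrary length.
    return [
        tanh(sum(wx * xi for wx, xi in zip(W_xh[i], x))
             + sum(wh * hj for wh, hj in zip(W_hh[i], h))
             + b[i])
        for i in range(len(h))
    ]

# Toy 2-dim input, 2-dim hidden state; weights are arbitrary.
W_xh = [[0.5, -0.3], [0.1, 0.4]]
W_hh = [[0.2, 0.0], [0.0, 0.2]]
b = [0.0, 0.0]

h = [0.0, 0.0]
for x in [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]:  # a 3-step input sequence
    h = rnn_step(x, h, W_xh, W_hh, b)
print(h)  # final hidden state summarises the whole sequence
```

Variants such as LSTM and GRU replace the plain `tanh` update with gated ones, which is what makes them practical for the long sequences in the applications listed above.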

Mystical Tutor: A Magic: The Gathering Design Assistant via Denoising Sequence-to-Sequence Learning

AAAI Conferences

Procedural Content Generation (PCG) has seen heavy focus on the generation of levels for video games, aesthetic content, and on rule creation, but has seen little use in other domains. Recently, the ready availability of Long Short Term Memory Recurrent Neural Networks (LSTM RNNs) has seen a rise in text-based procedural generation, including card designs for Collectible Card Games (CCGs) like Hearthstone or Magic: The Gathering. In this work we present a mixed-initiative design tool, Mystical Tutor, that allows a user to type in a partial specification for a card and receive a full card design. This is achieved by using sequence-to-sequence learning as a denoising sequence autoencoder, allowing Mystical Tutor to learn how to translate from partial specifications to full designs.
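The denoising setup amounts to corrupting complete card texts into "partial specifications" and training the model to reconstruct the original. A minimal sketch of such a corruption function, assuming simple random token dropout (the card text and drop rate here are illustrative, not taken from the paper):

```python
import random

def corrupt(tokens, drop_prob=0.3, rng=None):
    # Make a "partial specification" by randomly dropping tokens;
    # a denoising sequence-to-sequence model is trained to
    # reconstruct the full card text from the corrupted version.
    rng = rng or random.Random(0)
    kept = [t for t in tokens if rng.random() >= drop_prob]
    return kept if kept else tokens[:1]  # never return an empty input

card = "flying when this creature enters the battlefield draw a card".split()
partial = corrupt(card)
print(" ".join(partial))
```

Training pairs are (partial, full): the decoder learns to fill the gaps, which at design time lets a user type a fragment of a card and receive a completed design.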