How to Code and Understand DeepMind's Neural Stack Machine - i am trask


(For more on derivatives and differentiability, see the rest of that tutorial.) Why do we care that the stack (as a function) is differentiable? Well, we used the derivative of the function to move the error around (more specifically... to backpropagate). For more on this, please see the tutorials I wrote on Basic Neural Networks, Gradient Descent, and Recurrent Neural Networks. I particularly recommend the last one because it demonstrates backpropagating through somewhat more arbitrary vector operations... kind of like what we're going to do here.
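To make "moving the error around" concrete, here is a minimal sketch of backpropagating through one simple vector operation. It is a toy, not the neural stack itself: the weighted-sum operation, the squared-error loss, and every name below (`forward`, `backward`, `values`, `strengths`, `target`) are assumptions chosen for illustration. The point is only that because the operation is differentiable, the error at the output can be pushed back onto each input with the chain rule.

```python
import numpy as np

def forward(values, strengths):
    # Weighted sum of the rows of `values`; smooth in both inputs.
    return strengths @ values

def backward(values, strengths, error):
    # Chain rule: send the output error back to each input.
    grad_values = np.outer(strengths, error)   # d(loss)/d(values)
    grad_strengths = values @ error            # d(loss)/d(strengths)
    return grad_values, grad_strengths

def loss(values, strengths, target):
    # Squared error, assumed here purely for illustration.
    return 0.5 * np.sum((forward(values, strengths) - target) ** 2)

values = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
strengths = np.array([0.2, 0.5, 0.3])
target = np.array([0.0, 1.0])

output = forward(values, strengths)
error = output - target  # derivative of the loss w.r.t. the output
grad_values, grad_strengths = backward(values, strengths, error)

# Sanity check: compare against a finite-difference estimate.
eps = 1e-6
numeric_grad = np.zeros_like(strengths)
for i in range(len(strengths)):
    bumped = strengths.copy()
    bumped[i] += eps
    numeric_grad[i] = (loss(values, bumped, target)
                       - loss(values, strengths, target)) / eps
```

If the analytic gradient and the finite-difference estimate agree, the derivative is carrying the error backward correctly. A non-differentiable operation (say, a hard "pop exactly one item" step) would give us no such gradient to follow.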