Hyper Networks

#artificialintelligence 

In this post, I will talk about our recent paper called [1609.09106] I worked on this paper as a Google Brain Resident - a great research program where we can work on machine learning research for a whole year, with a salary and benefits! The Brain team is now accepting applications for the 2017 program: see g.co/brainresidency. The weight matrices of the LSTM are changing over time. Most modern neural network architectures are either a deep ConvNet, or a long RNN, or some combination of the two. These two architectures seem to be at opposite ends of a spectrum. Recurrent Networks can be viewed as a really deep feed forward network with the identical weights at each layer (this is called weight-tying). A deep ConvNet allows each layer to be different.