Recurrent Networks: Second Order Properties and Pruning
Pedersen, Morten With; Hansen, Lars Kai
Second order properties of cost functions for recurrent networks are investigated. We analyze a layered fully recurrent architecture; the virtue of this architecture is that it features the conventional feedforward architecture as a special case. A detailed description of recursive computation of the full Hessian of the network cost function is provided. We discuss the possibility of invoking simplifying approximations of the Hessian and show how weight decay irons the cost function and thereby greatly assists training. We present tentative pruning results, using Hassibi et al.'s Optimal Brain Surgeon, demonstrating that recurrent networks can construct an efficient internal memory.

1 LEARNING IN RECURRENT NETWORKS

Time series processing is an important application area for neural networks, and numerous architectures have been suggested, see e.g.
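As a pointer to the machinery the abstract invokes, a minimal LaTeX restatement of the standard Optimal Brain Surgeon formulas of Hassibi et al. is sketched below. This is a reference aside, not material from the paper itself, and the notation (weights w, Hessian H, unit vector e_q, decay parameter lambda) is assumed here rather than taken from the text.

\documentclass{article}
\usepackage{amsmath}
\begin{document}
% Optimal Brain Surgeon (Hassibi & Stork): the saliency S_q estimates the
% increase in cost when weight w_q is forced to zero and the remaining
% weights are re-adjusted, under a quadratic approximation of the cost.
\[
  S_q = \frac{w_q^2}{2\,[H^{-1}]_{qq}},
  \qquad
  \delta w = -\frac{w_q}{[H^{-1}]_{qq}}\, H^{-1} e_q ,
\]
% where e_q is the unit vector selecting weight q. Adding a quadratic
% weight decay term (lambda/2)||w||^2 to the cost adds lambda*I to the
% Hessian, H_reg = H + lambda*I, which improves its conditioning; this is
% one way to read the abstract's remark that weight decay "irons" the
% cost function.
\end{document}

The weight with the smallest saliency S_q is pruned first, and the update delta w re-adjusts all remaining weights in a single step; both quantities require only the inverse Hessian, which is why a reliable computation of the full Hessian matters for pruning recurrent networks.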
Neural Information Processing Systems
Dec-31-1995