Recurrent Networks: Second Order Properties and Pruning
–Neural Information Processing Systems
Second order properties of cost functions for recurrent networks are investigated. We analyze a layered fully recurrent architecture, the virtue of this architecture is that it features the conventional feedforward architecture as a special case. A detailed description of recursive computation of the full Hessian of the network cost func(cid:173) tion is provided. We discuss the possibility of invoking simplifying approximations of the Hessian and show how weight decays iron the cost function and thereby greatly assist training. We present tenta(cid:173) tive pruning results, using Hassibi et al.'s Optimal Brain Surgeon, demonstrating that recurrent networks can construct an efficient internal memory.
Neural Information Processing Systems
Apr-6-2023, 18:46:52 GMT
- Technology: