When less is more: evolving large neural networks from small ones

Anil Radhakrishnan, John F. Lindner, Scott T. Miller, Sudeshna Sinha, William L. Ditto

arXiv.org Artificial Intelligence 

In contrast to conventional artificial neural networks, which are large and structurally static, we study feed-forward neural networks that are small and dynamic, whose nodes can be added (or subtracted) during training. A single neuronal weight in the network controls the network's size, while the weight itself is optimized by the same gradient-descent algorithm that optimizes the network's other weights and biases, but with a size-dependent objective or loss function. We train and evaluate such Nimble Neural Networks on nonlinear regression and classification tasks, where they outperform the corresponding static networks. Growing networks to minimal, appropriate, or optimal sizes while training elucidates network dynamics and contrasts with pruning large networks after training but before deployment.
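As a rough illustration of the idea described above (not the authors' implementation), the sketch below gates the hidden units of a small regression network with a single trainable "size weight" and adds a size-proportional penalty to the loss, so the same gradient descent that fits the data also decides how many nodes are worth activating. The soft-gating scheme, penalty coefficient, and all identifiers are assumptions made for illustration only.

```python
# Minimal sketch (assumed formulation, not the paper's code): a one-hidden-layer
# regression network whose effective width is controlled by one trainable scalar,
# optimized by the same gradient descent as the other weights and biases, but
# with a size-dependent term added to the loss.
import torch

torch.manual_seed(0)

H_MAX = 32            # maximum number of hidden nodes the network can grow into
SIZE_PENALTY = 1e-3   # weight of the size-dependent loss term (assumed value)

# Toy nonlinear regression target: y = sin(2*pi*x) on [0, 1].
x = torch.linspace(0.0, 1.0, 256).unsqueeze(1)
y = torch.sin(2 * torch.pi * x)

# Ordinary weights and biases of a one-hidden-layer network.
W1 = (0.5 * torch.randn(1, H_MAX)).requires_grad_()
b1 = torch.zeros(H_MAX, requires_grad=True)
W2 = (0.1 * torch.randn(H_MAX, 1)).requires_grad_()
b2 = torch.zeros(1, requires_grad=True)

# The single size-controlling weight; it starts small (~2 active nodes).
size_weight = torch.tensor(2.0, requires_grad=True)

opt = torch.optim.Adam([W1, b1, W2, b2, size_weight], lr=1e-2)
unit_index = torch.arange(H_MAX, dtype=torch.float32)

for step in range(3000):
    opt.zero_grad()
    # Soft gate: hidden unit i is (mostly) active when i < size_weight,
    # so increasing size_weight smoothly "adds" nodes during training.
    gate = torch.sigmoid(4.0 * (size_weight - unit_index))
    hidden = torch.tanh(x @ W1 + b1) * gate
    pred = hidden @ W2 + b2
    mse = torch.mean((pred - y) ** 2)
    # Size-dependent loss: fitting error plus a penalty proportional to size,
    # so the network grows only when extra nodes reduce the error enough.
    loss = mse + SIZE_PENALTY * size_weight
    loss.backward()
    opt.step()

print(f"final mse = {mse.item():.4f}, effective size ≈ {size_weight.item():.1f} nodes")
```

In this toy version the size penalty plays the role of the paper's size-dependent objective: gradient descent trades fitting error against network size, growing the hidden layer from a few nodes toward whatever width the regression task actually requires rather than pruning a large network after training.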