Topological obstructions in neural networks learning
Barannikov, Serguei, Sotnikov, Grigorii, Trofimov, Ilya, Korotin, Alexander, Burnaev, Evgeny
arXiv.org Artificial Intelligence
We apply methods of topological data analysis to loss functions to gain insight into how deep neural networks learn and how well they generalize. We study global properties of the loss function's gradient flow, using topological data analysis of the loss function and its Morse complex to relate local behaviour along gradient trajectories to global properties of the loss surface. We define a neural network's Topological Obstructions score ("TOscore") using robust topological invariants (barcodes of the loss function) that quantify the "badness" of local minima for gradient-based optimization. We ran several experiments computing these invariants for small neural networks and for fully connected, convolutional, and ResNet-like neural networks on several datasets: MNIST, Fashion MNIST, CIFAR10, and SVHN. Our two principal observations are: 1) the neural network's barcode and TOscore decrease as the network's depth and width increase; 2) there is an intriguing connection between the lengths of the minima's segments in the barcode and the minima's generalization error.

Introduction

Mathematically, if one opens the "black box" of deep learning, there are two immediate mysteries.
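To make the notion of a barcode of minima from the abstract concrete, here is a minimal sketch for a function sampled on a one-dimensional grid. It uses the standard 0-dimensional sublevel-set persistence computation (union-find with the "elder rule"), which is an assumption for illustration and not necessarily the algorithm used in the paper. The function name `minima_barcode` is hypothetical, and the sketch does not compute the TOscore. Each local minimum yields one segment that starts at the minimum's value and ends at the level where its basin merges into the basin of a lower minimum.

```python
import math


def minima_barcode(values):
    """Barcode of local minima of a sampled 1D function (illustrative sketch).

    Returns a sorted list of (birth, death) pairs. The global minimum gets an
    infinite segment because its basin never merges into a lower one.
    """
    values = [float(v) for v in values]
    n = len(values)
    if n == 0:
        return []

    parent = [-1] * n  # -1 marks a grid point not yet in the sublevel set

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    bars = []
    # Sweep the threshold upward: add grid points in order of increasing value.
    for i in sorted(range(n), key=values.__getitem__):
        parent[i] = i
        for j in (i - 1, i + 1):
            if 0 <= j < n and parent[j] != -1:
                ri, rj = find(i), find(j)
                if ri == rj:
                    continue
                # Each root is the lowest point of its component. Elder rule:
                # the component with the higher minimum dies at this level.
                if values[ri] < values[rj]:
                    ri, rj = rj, ri
                if values[ri] < values[i]:  # skip zero-length segments
                    bars.append((values[ri], values[i]))
                parent[ri] = rj

    bars.append((min(values), math.inf))
    return sorted(bars)


if __name__ == "__main__":
    # Three local minima (values 0, 1, 2) separated by maxima (values 4, 5).
    print(minima_barcode([3, 1, 4, 0, 5, 2, 6]))
    # [(0.0, inf), (1.0, 4.0), (2.0, 5.0)]
```

In this toy example the minimum at value 1 has a segment of length 3, and the minimum at value 2 also has length 3. A gradient-based method started in either basin would have to climb that far to escape to a lower minimum. In higher dimensions the merge levels are attained at saddle points rather than at 1D local maxima.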
Dec-31-2020