 pre-trained dnn





Ranking and Rejecting of Pre-Trained Deep Neural Networks in Transfer Learning based on Separation Index

arXiv.org Machine Learning

Automated ranking of pre-trained Deep Neural Networks (DNNs) reduces the time required to select the optimal pre-trained DNN and boosts classification performance in transfer learning. In this paper, we introduce a novel algorithm that ranks pre-trained DNNs by applying a straightforward distance-based complexity measure, the Separation Index (SI), to the target dataset. We first give background on the SI and then explain the automated ranking algorithm. In this algorithm, the SI is computed on the target dataset after it has passed through the feature-extracting part of each pre-trained DNN. The pre-trained DNNs are then ranked by sorting the computed SIs in descending order. Under this ranking, the best DNN is the one that yields the maximum SI on the target dataset, and some pre-trained DNNs may be rejected if their computed SIs are sufficiently low. The efficiency of the proposed algorithm is evaluated on three challenging datasets: Linnaeus 5, Breast Cancer Images, and COVID-CT. For the first two case studies, the results of the proposed algorithm exactly match the ranking of the trained DNNs by accuracy on the target dataset. For the third case study, despite different preprocessing of the target data, the ranking produced by the algorithm correlates highly with the ranking obtained from classification accuracy.
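
The ranking procedure described in this abstract can be sketched roughly as follows. The abstract does not define the SI, so this sketch assumes a first-order form: the fraction of samples whose nearest neighbour in feature space carries the same label. The names `separation_index`, `rank_extractors`, and `reject_below`, and the representation of each pre-trained DNN's feature-extracting part as a plain Python callable, are hypothetical and chosen only for illustration.

```python
import numpy as np


def separation_index(features, labels):
    """Assumed first-order SI: share of samples whose nearest neighbour
    (excluding the sample itself) has the same label."""
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    # Pairwise squared Euclidean distances between all samples.
    sq = np.sum(features ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * features @ features.T
    # A sample must not be counted as its own nearest neighbour.
    np.fill_diagonal(d2, np.inf)
    nearest = np.argmin(d2, axis=1)
    return float(np.mean(labels[nearest] == labels))


def rank_extractors(extractors, x, y, reject_below=None):
    """Rank feature extractors by descending SI on the target data.

    extractors: dict mapping a name to a callable that maps inputs to a
    2-D feature array (a stand-in for a pre-trained DNN's feature part).
    reject_below: optional, hypothetical SI threshold for rejection.
    """
    scores = {name: separation_index(f(x), y) for name, f in extractors.items()}
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    if reject_below is not None:
        ranked = [(name, s) for name, s in ranked if s >= reject_below]
    return ranked
```

In practice each callable would wrap a frozen network up to its last feature layer; the rejection threshold is a free parameter here, since the abstract only says that DNNs with sufficiently low SI may be rejected.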


High-contrast "gaudy" images improve the training of deep neural network models of visual cortex

arXiv.org Machine Learning

A key challenge in understanding the sensory transformations of the visual system is to obtain a highly predictive model of responses from visual cortical neurons. Deep neural networks (DNNs) provide a promising candidate for such a model. However, DNNs require orders of magnitude more training data than neuroscientists can collect from real neurons because experimental recording time is severely limited. This motivates us to find images that train highly predictive DNNs with as little training data as possible. We propose gaudy images (high-contrast binarized versions of natural images) to efficiently train DNNs. In extensive simulation experiments, we find that training DNNs with gaudy images substantially reduces the number of training images needed to accurately predict the simulated responses of visual cortical neurons. We also find that gaudy images, chosen before training, outperform images chosen during training by active learning algorithms. Thus, gaudy images overemphasize features of natural images, especially edges, that are the most important for efficiently training DNNs. We believe gaudy images will aid in the modeling of visual cortical neurons, potentially opening new scientific questions about visual processing, and will also aid general practitioners who seek ways to improve the training of DNNs.
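
As a rough illustration of what a "high-contrast binarized version" of an image could look like, the sketch below thresholds every pixel at the image's own mean intensity and maps it to 0 or 255. The abstract does not state the threshold rule, so the mean-intensity threshold and the function name `gaudy` are assumptions made for this example.

```python
import numpy as np


def gaudy(image):
    """Binarize an image to the extremes 0 and 255.

    Assumption: the threshold is the mean intensity over all pixels
    (and channels) of this one image; the abstract does not specify it.
    """
    image = np.asarray(image, dtype=float)
    threshold = image.mean()
    # Pixels at or above the mean become 255, all others become 0.
    return np.where(image >= threshold, 255, 0).astype(np.uint8)
```

Because the transform needs only the image itself, it can be applied before any training starts, which matches the abstract's point that gaudy images are chosen before training.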


Knowledge Isomorphism between Neural Networks

arXiv.org Machine Learning

This paper aims to analyze knowledge isomorphism between pre-trained deep neural networks. We propose a generic definition for knowledge isomorphism between neural networks at different fuzziness levels, and design a task-agnostic and model-agnostic method to disentangle and quantify isomorphic features from intermediate layers of a neural network. As a generic tool, our method can be broadly used for different applications. In preliminary experiments, we have used knowledge isomorphism as a tool to diagnose feature representations of neural networks. Knowledge isomorphism provides new insights to explain the success of existing deep-learning techniques, such as knowledge distillation and network compression. More crucially, it has been shown that knowledge isomorphism can also be used to refine pre-trained networks and boost performance.


Heavy-Tailed Universality Predicts Trends in Test Accuracies for Very Large Pre-Trained Deep Neural Networks

arXiv.org Machine Learning

Given two or more Deep Neural Networks (DNNs) with the same or similar architectures, and trained on the same dataset, but trained with different solvers, parameters, hyper-parameters, regularization, etc., can we predict which DNN will have the best test accuracy, and can we do so without peeking at the test data? In this paper, we show how to use a new Theory of Heavy-Tailed Self-Regularization (HT-SR) to answer this. HT-SR suggests, among other things, that modern DNNs exhibit what we call Heavy-Tailed Mechanistic Universality (HT-MU), meaning that the correlations in the layer weight matrices can be fit to a power law (PL) with exponents that lie in common Universality classes from Heavy-Tailed Random Matrix Theory (HT-RMT). From this, we develop a Universal capacity control metric that is a weighted average of these PL exponents. Rather than considering small toy NNs, we examine over 50 different, large-scale pre-trained DNNs, ranging over 15 different architectures, trained on ImageNet, each of which has been reported to have different test accuracies. We show that this new capacity metric correlates very well with the reported test accuracies of these DNNs, looking across each architecture (VGG16/.../VGG19, ResNet10/.../ResNet152, etc.). We also show how to approximate the metric by the more familiar Product Norm capacity measure, as the average of the log Frobenius norm of the layer weight matrices. Our approach requires no changes to the underlying DNN or its loss function, does not require us to train a model (although it could be used to monitor training), and does not even require access to the ImageNet data.
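
The Product Norm approximation mentioned at the end of this abstract, the average of the log Frobenius norm of the layer weight matrices, can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `avg_log_frobenius_norm` is hypothetical, the base-10 logarithm is a choice the abstract does not specify, and each layer is assumed to be available as a 2-D weight matrix. The sketch does not fit power-law exponents, so it does not compute the weighted-average metric itself.

```python
import numpy as np


def avg_log_frobenius_norm(weight_matrices):
    """Average of log10 Frobenius norms over a list of 2-D weight matrices.

    Assumptions: base-10 logarithm and one 2-D matrix per layer; neither
    detail is given in the abstract.
    """
    logs = [
        np.log10(np.linalg.norm(np.asarray(w, dtype=float), ord="fro"))
        for w in weight_matrices
    ]
    return float(np.mean(logs))
```

The computation touches only the stored weights, which is consistent with the abstract's claim that no training or test data is needed to evaluate the measure.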


Microsoft Cranks AI Efforts Up To 11

Forbes - Tech

Microsoft is justifiably proud of its hardware/software co-design approach to accelerating a wide range of data center workloads using Intel FPGAs. The company recently shared some progress in this area and subsequently announced its acquisition of the AI startup Bonsai to ease the on-ramp for building AI on Microsoft. These advances give more advantages to the Microsoft AI strategy and warrant further analysis. I believe the company is very well-positioned to lead the penetration of AI into the enterprise market, where its productivity software and cloud success give it a springboard for growth. The Brainwave project uses large arrays of Intel FPGAs to accelerate deep neural network (DNN) inference processing for search, ad targeting, facial recognition, and more.