ResNet50 is a convolutional neural network with a depth of 50 layers. It was built and trained by Microsoft in 2015, and the model's performance results are reported in their paper, "Deep Residual Learning for Image Recognition". The model is trained on more than 1 million images from the ImageNet database. Just like VGG-19, it can classify images into 1000 object categories, and the network was trained on 224x224-pixel color images.
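As a rough illustration, here is one way to load ResNet50 and check the input and output sizes described above. This is a minimal sketch that assumes TensorFlow's Keras API (`tf.keras.applications.ResNet50`), which the text above does not name. The helper names `build_resnet50`, `classify`, and `demo` are made up for this example, and the random image only stands in for real data.

```python
import numpy as np
import tensorflow as tf


def build_resnet50(weights="imagenet"):
    """Return ResNet50 with its 1000-class ImageNet classifier on top.

    weights="imagenet" downloads the pre-trained weights on first use;
    weights=None builds the same architecture with random weights.
    """
    return tf.keras.applications.ResNet50(weights=weights)


def classify(model, image_batch):
    """Classify RGB images of shape (N, 224, 224, 3) with values in 0-255."""
    # Work on a float copy so the caller's array is left untouched.
    x = np.array(image_batch, dtype="float32")
    x = tf.keras.applications.resnet50.preprocess_input(x)
    return model.predict(x, verbose=0)  # one row of 1000 scores per image


def demo():
    """Hypothetical usage: downloads the ImageNet weights and class labels."""
    model = build_resnet50()
    image = np.random.randint(0, 256, size=(1, 224, 224, 3))
    scores = classify(model, image)
    decoded = tf.keras.applications.resnet50.decode_predictions(scores, top=3)
    for _, label, score in decoded[0]:
        print(f"{label}: {score:.3f}")
```

Calling `demo()` needs an internet connection the first time, because Keras fetches the pre-trained weights and the class index file. Passing `weights=None` to `build_resnet50` skips the download, which is handy if you only want to inspect the architecture.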
I hope you are now able to apply pre-trained models to your own problem statements. Make sure that the pre-trained model you select was trained on a data set similar to the one you wish to use it on. People have tried various architectures on different types of data sets, and I strongly encourage you to go through these architectures and apply them to your own problems. Please feel free to discuss your doubts and concerns in the comments section.
Fig: The model summary of the second network, showing the fixed and trainable weights. The fixed weights are transferred directly from the first network.

Now we train the second model and observe how it takes less overall time and still achieves equal or higher performance. The accuracy of the second model is even higher than that of the first model, although this may not always be the case; it depends on the model architecture and the dataset.

Fig: Validation set accuracy over epochs while training the second network.
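To make the idea of fixed and trainable weights concrete, here is a small sketch of how weights could be transferred from a first network to a second one and then frozen. It assumes TensorFlow's Keras API and uses tiny dense layers. The layer sizes, layer names, and the helpers `build_first_network`, `build_second_network`, and `count_parameters` are all invented for illustration and are not the networks used in this post.

```python
import tensorflow as tf
from tensorflow.keras import layers


def build_first_network(input_dim=20, num_classes=3):
    """The network we assume has already been trained on the first task."""
    return tf.keras.Sequential([
        tf.keras.Input(shape=(input_dim,)),
        layers.Dense(32, activation="relu", name="shared_1"),
        layers.Dense(16, activation="relu", name="shared_2"),
        layers.Dense(num_classes, activation="softmax", name="head_a"),
    ])


def build_second_network(first, input_dim=20, num_classes=5):
    """Reuse the first network's hidden layers and add a new output layer."""
    second = tf.keras.Sequential([
        tf.keras.Input(shape=(input_dim,)),
        layers.Dense(32, activation="relu", name="shared_1"),
        layers.Dense(16, activation="relu", name="shared_2"),
        layers.Dense(num_classes, activation="softmax", name="head_b"),
    ])
    for name in ("shared_1", "shared_2"):
        layer = second.get_layer(name)
        # Copy the weights over, then fix them so training leaves them alone.
        layer.set_weights(first.get_layer(name).get_weights())
        layer.trainable = False
    # Compile after freezing so the trainable flags take effect.
    second.compile(optimizer="adam",
                   loss="sparse_categorical_crossentropy",
                   metrics=["accuracy"])
    return second


def count_parameters(weights):
    """Total number of scalar values in a list of weight variables."""
    return sum(int(w.numpy().size) for w in weights)
```

With these made-up sizes, `count_parameters(second.trainable_weights)` gives 85 and `count_parameters(second.non_trainable_weights)` gives 1200, the same split that `second.summary()` prints. Only the new output layer is updated during `second.fit(...)`, which is one reason the second model can train faster.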
Highlights: In this post we show how to build a computer vision model without building it from scratch. The idea behind transfer learning is that a neural network trained on a large dataset can apply its knowledge to a dataset it has never seen before. That is why it is called transfer learning: we transfer the learning of an existing model to a new dataset. Previously we explored how to improve a model's performance using data augmentation. The question now is, "What if we don't have enough data to train our network from scratch?"