Convolutional Neural Network (CNN) is the state-of-the-art for image classification task. Here we have briefly discussed different components of CNN. In this paper, We have explained different CNN architectures for image classification. Through this paper, we have shown advancements in CNN from LeNet-5 to latest SENet model. We have discussed the model description and training details of each model. We have also drawn a comparison among those models.
In this post, I'll discuss commonly used architectures for convolutional networks. As you'll see, almost all CNN architectures follow the same general design principles of successively applying convolutional layers to the input, periodically downsampling the spatial dimensions while increasing the number of feature maps. While the classic network architectures were comprised simply of stacked convolutional layers, modern architectures explore new and innovative ways for constructing convolutional layers in a way which allows for more efficient learning. Almost all of these architectures are based on a repeatable unit which is used throughout the network. These architectures serve as general design guidelines which machine learning practitioners will then adapt to solve various computer vision tasks.
Convolutional neural networks are fantastic for visual recognition tasks. Good ConvNets are beasts with millions of parameters and many hidden layers. In fact, a bad rule of thumb is: 'higher the number of hidden layers, better the network'. AlexNet, VGG, Inception, ResNet are some of the popular networks. Why do these networks work so well?
Computer vision is a subject to convert images and videos into machine-understandable signals. With these signals, programmers can further control the behavior of the machine based on this high-level understanding. Among many computer vision tasks, image classification is one of the most fundamental ones. It not only can be used in lots of real products like Google Photo's tagging and AI content moderation but also opens a door for lots of more advanced vision tasks, such as object detection and video understanding. Due to the rapid changes in this field since the breakthrough of Deep Learning, beginners often find it too overwhelming to learn. Unlike typical software engineering subjects, there are not many great books about image classification using DCNN, and the best way to understand this field is though reading academic papers. But what papers to read? In this article, I'm going to introduce 10 best papers for beginners to read.