In this post, I'll discuss commonly used architectures for convolutional networks. As you'll see, almost all CNN architectures follow the same general design principles of successively applying convolutional layers to the input, periodically downsampling the spatial dimensions while increasing the number of feature maps. While the classic network architectures were comprised simply of stacked convolutional layers, modern architectures explore new and innovative ways for constructing convolutional layers in a way which allows for more efficient learning. Almost all of these architectures are based on a repeatable unit which is used throughout the network. These architectures serve as general design guidelines which machine learning practitioners will then adapt to solve various computer vision tasks.
In this work, we investigate the value of employing deep learning for the task of wireless signal modulation recognition. Recently in , a framework has been introduced by generating a dataset using GNU radio that mimics the imperfections in a real wireless channel, and uses 10 different modulation types. Further, a convolutional neural network (CNN) architecture was developed and shown to deliver performance that exceeds that of expert-based approaches. Here, we follow the framework of  and find deep neural network architectures that deliver higher accuracy than the state of the art. We tested the architecture of  and found it to achieve an accuracy of approximately 75% of correctly recognizing the modulation type. We first tune the CNN architecture of  and find a design with four convolutional layers and two dense layers that gives an accuracy of approximately 83.8% at high SNR. We then develop architectures based on the recently introduced ideas of Residual Networks (ResNet ) and Densely Connected Networks (DenseNet ) to achieve high SNR accuracies of approximately 83.5% and 86.6%, respectively. Finally, we introduce a Convolutional Long Short-term Deep Neural Network (CLDNN ) to achieve an accuracy of approximately 88.5% at high SNR.
What are Convolutional Neural Networks and why are they important? Convolutional Neural Networks (ConvNets or CNNs) are a category of Neural Networks that have proven very effective in areas such as image recognition and classification. ConvNets have been successful in identifying faces, objects and traffic signs apart from powering vision in robots and self driving cars. In Figure 1 above, a ConvNet is able to recognize scenes and the system is able to suggest relevant tags such as'bridge', 'railway' and'tennis' while Figure 2 shows an example of ConvNets being used for recognizing everyday objects, humans and animals. Lately, ConvNets have been effective in several Natural Language Processing tasks (such as sentence classification) as well. ConvNets, therefore, are an important tool for most machine learning practitioners today. However, understanding ConvNets and learning to use them for the first time can sometimes be an intimidating experience. The primary purpose of this blog post is to develop an understanding of how Convolutional Neural Networks work on images. If you are new to neural networks in general, I would recommend reading this short tutorial on Multi Layer Perceptrons to get an idea about how they work, before proceeding. Multi Layer Perceptrons are referred to as "Fully Connected Layers" in this post.
Over the past few years, much of the progress in deep learning for computer vision can be boiled down to just a handful of neural network architectures. Setting aside all the math, the code, and the implementation details, I wanted to explore one simple question: how and why do these models work? The VGG networks, along with the earlier AlexNet from 2012, follow the now archetypal layout of basic conv nets: a series of convolutional, max-pooling, and activation layers before some fully-connected classification layers at the end. MobileNet is essentially a streamlined version of the Xception architecture optimized for mobile applications. The remaining three, however, truly redefine the way we look at neural networks.