Before learning about convolutional neural networks, one should understand how a basic neural network works. Neural networks loosely imitate the human brain to solve complex problems and find patterns in data. Over the past few years, they have displaced many traditional machine learning and computer vision algorithms. The basic model, a neural network, consists of neurons organized in layers. Every neural network has an input layer and an output layer, with a number of hidden layers between them that depends on the complexity of the problem.
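To make the layered structure concrete, here is a minimal sketch of a forward pass through a tiny network with one hidden layer. The weights, biases, and layer sizes are made-up illustrative values, not a trained model:

```python
import math

def relu(x):
    # Activation used in the hidden layer: pass positives, zero out negatives.
    return max(0.0, x)

def sigmoid(x):
    # Activation used in the output layer: squash to the range (0, 1).
    return 1.0 / (1.0 + math.exp(-x))

def dense(inputs, weights, biases, activation):
    # One fully connected layer: each neuron takes a weighted sum of all
    # inputs, adds its bias, and applies the activation function.
    return [activation(sum(w * x for w, x in zip(ws, inputs)) + b)
            for ws, b in zip(weights, biases)]

# Toy network: 2 inputs -> 2 hidden neurons -> 1 output neuron.
inputs = [0.5, -1.0]
hidden = dense(inputs, [[1.0, 0.5], [-0.5, 1.0]], [0.0, 0.1], relu)
output = dense(hidden, [[1.0, -1.0]], [0.0], sigmoid)
```

Stacking more `dense` calls between the input and output is all it takes to add hidden layers for harder problems.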
Perhaps you've wondered how Facebook or Instagram automatically recognizes faces in an image, or how Google lets you search the web for visually similar photos just by uploading one of your own. These features are examples of computer vision, and they are powered by convolutional neural networks (CNNs). But what exactly are convolutional neural networks? Let's take a deep dive into the architecture of a CNN and understand how it operates. Before we begin, let's take a moment to define regular neural networks.
What are Convolutional Neural Networks and why are they important? Convolutional Neural Networks (ConvNets or CNNs) are a category of neural networks that have proven very effective in areas such as image recognition and classification. ConvNets have been successful in identifying faces, objects, and traffic signs, apart from powering vision in robots and self-driving cars. In Figure 1 above, a ConvNet recognizes scenes and the system suggests relevant tags such as 'bridge', 'railway' and 'tennis', while Figure 2 shows an example of ConvNets being used to recognize everyday objects, humans, and animals. Lately, ConvNets have also been effective in several Natural Language Processing tasks (such as sentence classification). ConvNets, therefore, are an important tool for most machine learning practitioners today. However, understanding ConvNets and learning to use them for the first time can be an intimidating experience. The primary purpose of this blog post is to develop an understanding of how Convolutional Neural Networks work on images. If you are new to neural networks in general, I would recommend reading this short tutorial on Multi Layer Perceptrons to get an idea of how they work before proceeding. Multi Layer Perceptrons are referred to as "Fully Connected Layers" in this post.
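The operation that gives these networks their name can be previewed in a few lines: a small kernel slides over an image and computes a weighted sum at each position. The image and kernel below are toy values I chose for illustration; the kernel happens to act as a simple vertical-edge detector:

```python
def convolve2d(image, kernel):
    """Slide a kernel over an image (valid padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    # Each output value is the elementwise product of the kernel with the
    # image patch under it, summed up.
    return [[sum(image[i + di][j + dj] * kernel[di][dj]
                 for di in range(kh) for dj in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]

# Tiny 4x4 image: dark left half (0s), bright right half (1s).
image = [[0, 0, 1, 1]] * 4
# 2x2 kernel that responds where brightness increases left to right.
kernel = [[-1, 1],
          [-1, 1]]
feature_map = convolve2d(image, kernel)
```

The resulting feature map is large exactly at the vertical edge in the middle of the image and zero elsewhere, which is the intuition behind the feature maps a ConvNet learns.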
This article is part of Demystifying AI, a series of posts that (try to) disambiguate the jargon and myths surrounding AI. Since the 1950s, the early days of artificial intelligence, computer scientists have been trying to build computers that can make sense of visual data. In the ensuing decades, the field, which became known as computer vision, saw incremental advances. In 2012, computer vision took a quantum leap when a group of researchers from the University of Toronto developed an AI model that surpassed the best image recognition algorithms by a large margin. The system, which became known as AlexNet (named after its main creator, Alex Krizhevsky), won the 2012 ImageNet computer vision contest with an impressive 85 percent accuracy.
Over the last decade, the use of artificial neural networks (ANNs) has increased considerably. With all the buzz about deep learning and artificial neural networks, haven't you always wanted to create one for yourself? In this tutorial, we'll use the Keras library to build and train a model that recognizes handwritten digits. Keras is a high-level Python library that acts as a wrapper over TensorFlow, CNTK, and Theano.
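As a preview of what the tutorial builds toward, here is a minimal sketch of a small convolutional model for 28x28 grayscale digit images, using the TensorFlow backend of Keras. The layer sizes and optimizer here are illustrative choices, not the tutorial's final configuration:

```python
from tensorflow import keras
from tensorflow.keras import layers

# A small CNN for MNIST-style digits; sizes are illustrative, not tuned.
model = keras.Sequential([
    layers.Input(shape=(28, 28, 1)),                    # one grayscale channel
    layers.Conv2D(8, kernel_size=3, activation="relu"),  # learn 8 feature maps
    layers.MaxPooling2D(pool_size=2),                    # downsample by 2
    layers.Flatten(),
    layers.Dense(10, activation="softmax"),              # one score per digit
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```

With the model compiled, training reduces to a single `model.fit(...)` call on the digit images and their labels.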