In Part I, we saw a few examples of image classification. In particular counting objects seemed to be difficult for convolutional neural networks. After sharing my work on the fast.ai Now we can create a learner and train it on this new dataset. Wow! Look at that, this time we're getting 100% accuracy.
"Deep Learning Is Setting Records!!" There is tremendous growth in people searching or showing interests about deep learning & AI in last few years. Every day hundreds of new articles get published on it in social media & press media. Above chart broadly explains as why search trend is ever growing for deep learning & AI. Fundamentally deep learning is a subset of Machine Learning. The reason as why it is exciting is that more data you give to deep learning usually you get more accuracy out from the model.
As massively parallel computations have become broadly available with modern GPUs, deep architectures trained on very large datasets have risen in popularity. Discriminatively trained convolutional neural networks, in particular, were recently shown to yield state-of-the-art performance in challenging image classification benchmarks such as ImageNet. However, elements of these architectures are similar to standard hand-crafted representations used in computer vision. In this paper, we explore the extent of this analogy, proposing a version of the state-of-the-art Fisher vector image encoding that can be stacked in multiple layers. This architecture significantly improves on standard Fisher vectors, and obtains competitive results with deep convolutional networks at a significantly smaller computational cost.
This paper concerns the undetermined problem of estimating geometric transformation between image pairs. Recent methods introduce deep neural networks to predict the controlling parameters of hand-crafted geometric transformation models (e.g. However, the low-dimension parametric models are incapable of estimating a highly complex geometric transform with limited flexibility to model the actual geometric deformation from image pairs. To address this issue, we present an end-to-end trainable deep neural networks, named Arbitrary Continuous Geometric Transformation Networks (Arbicon-Net), to directly predict the dense displacement field for pairwise image alignment. Arbicon-Net is generalized from training data to predict the desired arbitrary continuous geometric transformation in a data-driven manner for unseen new pair of images.
To build an Image Search Engine that retrieves the most similar images from the database based on specific target images. Given a query image (containing a specific instance) and a collection of images with different contents, we want to find the images that contain the same query instance from the collection. The below images are two examples of query images (original cropped). The image below is the query result using ResNet transfer learning. Since I have ten query images, there are ten rows of images, with each row containing the ten most similar images to the query image.