Image Matching


A Simple Cache Model for Image Recognition

Neural Information Processing Systems

Training large-scale image recognition models is computationally expensive. This raises the question of whether there might be simple ways to improve the test performance of an already trained model without having to re-train or fine-tune it with new data. Here, we show that, surprisingly, this is indeed possible. The key observation we make is that the layers of a deep network close to the output layer contain independent, easily extractable class-relevant information that is not contained in the output layer itself. We propose to extract this extra class-relevant information using a simple key-value cache memory to improve the classification performance of the model at test time.
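As a rough illustration of the idea, the sketch below builds a key-value cache from late-layer embeddings of training images and blends a similarity-weighted cache vote with the trained model's own class probabilities at test time. The function names, the cosine-similarity kernel, and the hyperparameters theta and lam are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def build_cache(embed_fn, images, labels, num_classes):
    """Store (key, value) pairs: keys are late-layer embeddings of
    training images, values are one-hot class labels."""
    keys = np.stack([embed_fn(x) for x in images])        # (N, D)
    keys /= np.linalg.norm(keys, axis=1, keepdims=True)   # unit-norm keys
    values = np.eye(num_classes)[labels]                  # (N, C)
    return keys, values

def cached_predict(embed_fn, model_probs, x, keys, values,
                   theta=50.0, lam=0.2):
    """Blend the trained model's class probabilities with a
    similarity-weighted vote over the cache (theta and lam are
    illustrative hyperparameters)."""
    q = embed_fn(x)
    q = q / np.linalg.norm(q)
    sims = keys @ q                       # cosine similarity to each key
    w = np.exp(theta * sims)
    w /= w.sum()                          # attention-like weights over cache
    cache_probs = w @ values              # (C,) cache-based class vote
    return (1.0 - lam) * model_probs + lam * cache_probs
```

Because the cache is built purely from forward passes through the frozen network, no re-training or fine-tuning is required.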


This Looks Like That: Deep Learning for Interpretable Image Recognition

Neural Information Processing Systems

When we are faced with challenging image classification tasks, we often explain our reasoning by dissecting the image and pointing out prototypical aspects of one class or another. The mounting evidence for each of the classes helps us make our final decision. In this work, we introduce a deep network architecture, the prototypical part network (ProtoPNet), that reasons in a similar way: the network dissects the image by finding prototypical parts and combines evidence from the prototypes to make a final classification. The model thus reasons in a way that is qualitatively similar to the way ornithologists, physicians, and others would explain how they solve challenging image classification tasks. The network uses only image-level labels for training, without any annotations for parts of images.
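A minimal sketch of this style of prototype-based evidence combination is shown below, with illustrative array shapes and an assumed distance-to-similarity transform; the actual ProtoPNet architecture and training procedure are described in the paper.

```python
import numpy as np

def prototype_logits(patch_embeds, prototypes, class_weights, eps=1e-4):
    """patch_embeds:  (num_patches, D) features from a conv backbone.
    prototypes:    (P, D) learned prototypical parts.
    class_weights: (P, C) connects each prototype to the class logits."""
    # Squared L2 distance from every patch to every prototype: (num_patches, P)
    d = ((patch_embeds[:, None, :] - prototypes[None, :, :]) ** 2).sum(-1)
    # Distances -> similarities (large when a patch is close to a prototype)
    sim = np.log((d + 1.0) / (d + eps))
    # Each prototype keeps its best-matching patch (max pooling over patches)
    evidence = sim.max(axis=0)            # (P,)
    # Combine prototype evidence into per-class logits
    return evidence @ class_weights       # (C,)
```

Interpretability comes from the fact that each entry of `evidence` can be traced back to the specific image patch that most resembles a specific prototype.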


Recurrent Registration Neural Networks for Deformable Image Registration

Neural Information Processing Systems

Parametric spatial transformation models have been successfully applied to image registration tasks. In such models, the transformation of interest is parameterized by a fixed set of basis functions, such as B-splines. Each basis function is located at a fixed position on a regular grid spanning the image domain, because the transformation of interest is not known in advance. As a consequence, not all basis functions will necessarily contribute to the final transformation, which results in a non-compact representation of the transformation. We instead treat registration as a sequential process: for each element in the sequence, a local deformation defined by its position, shape, and weight is computed by our recurrent registration neural network.
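The sketch below illustrates the general idea of accumulating a sequence of local deformations into one dense displacement field, assuming for illustration that each local deformation is an isotropic Gaussian bump; the network's actual parameterization may differ.

```python
import numpy as np

def add_local_deformation(field, center, sigma, weight):
    """Add one local deformation, parameterized by its position (center),
    shape (sigma of an isotropic Gaussian), and weight (a 2-D
    displacement vector), to the running dense field."""
    h, w = field.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    g = np.exp(-((ys - center[0])**2 + (xs - center[1])**2) / (2 * sigma**2))
    field += g[..., None] * np.asarray(weight)   # broadcast over (dy, dx)
    return field

# Accumulate a sequence of predicted local deformations into one field.
field = np.zeros((128, 128, 2))
for center, sigma, weight in [((40, 40), 8.0, (1.5, -0.5)),
                              ((90, 70), 12.0, (-0.8, 2.0))]:
    field = add_local_deformation(field, center, sigma, weight)
```

Because deformations are only emitted where they are needed, the resulting representation is compact compared to a fixed grid of basis functions.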


Bilevel Distance Metric Learning for Robust Image Recognition

Neural Information Processing Systems

Metric learning, which aims to learn a discriminative Mahalanobis distance matrix M that can effectively reflect the similarity between data samples, has been widely studied in various image recognition problems. Most existing metric learning methods take as input features extracted directly from the original data in a preprocessing step. These features usually take no account of the local geometrical structure of the data or of the noise present in it, so they may not be optimal for the subsequent metric learning task. In this paper, we integrate both feature extraction and metric learning into one joint optimization framework and propose a new bilevel distance metric learning model. Specifically, the lower level characterizes the intrinsic data structure using graph regularized sparse coefficients, while the upper level forces the data samples from the same class to be close to each other and pushes those from different classes far apart.
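For reference, the Mahalanobis distance that the metric learning stage targets has the form d_M(x, y) = sqrt((x - y)^T M (x - y)) for a learned positive semidefinite matrix M; a minimal sketch:

```python
import numpy as np

def mahalanobis(x, y, M):
    """Mahalanobis distance d_M(x, y) = sqrt((x - y)^T M (x - y)),
    where M is a learned positive semidefinite matrix."""
    d = x - y
    return np.sqrt(d @ M @ d)

# With M = I this reduces to the ordinary Euclidean distance.
x, y = np.array([1.0, 2.0]), np.array([2.0, 0.0])
print(mahalanobis(x, y, np.eye(2)))   # ~2.236
```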


Arbicon-Net: Arbitrary Continuous Geometric Transformation Networks for Image Registration

Neural Information Processing Systems

This paper concerns the underdetermined problem of estimating the geometric transformation between image pairs. Recent methods introduce deep neural networks to predict the controlling parameters of hand-crafted geometric transformation models (e.g., thin-plate splines). However, such low-dimensional parametric models have limited flexibility and are incapable of estimating the highly complex geometric deformations that occur between image pairs. To address this issue, we present an end-to-end trainable deep neural network, named Arbitrary Continuous Geometric Transformation Networks (Arbicon-Net), to directly predict the dense displacement field for pairwise image alignment. Arbicon-Net generalizes from training data to predict the desired arbitrary continuous geometric transformation for unseen image pairs in a data-driven manner.
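Once a dense displacement field has been predicted, warping the source image toward the target is straightforward; below is a minimal sketch using SciPy interpolation, where the displacement field is a stand-in for a network prediction such as Arbicon-Net's output.

```python
import numpy as np
from scipy.ndimage import map_coordinates

def warp(image, displacement):
    """Warp an image with a dense displacement field.
    image:        (H, W) source image.
    displacement: (H, W, 2) per-pixel (dy, dx) offsets, e.g. as
                  predicted by a registration network."""
    h, w = image.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(float)
    # Sample the source image at the displaced coordinates.
    coords = np.stack([ys + displacement[..., 0],
                       xs + displacement[..., 1]])
    return map_coordinates(image, coords, order=1, mode='nearest')
```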


Artificial intelligence: Towards a better understanding of the underlying mechanisms

#artificialintelligence

The automatic identification of complex features in images has already become a reality thanks to artificial neural networks. Some examples of software exploiting this technique are Facebook's automatic tagging system, Google's image search engine and the animal and plant recognition system used by iNaturalist. We know that these networks are inspired by the human brain, but their working mechanism is still mysterious. New research, conducted by SISSA in association with the Technical University of Munich and presented at the 33rd Annual NeurIPS Conference, proposes a new approach for studying deep neural networks and sheds new light on the image-processing steps these networks carry out. Similar to what happens in the visual system, neural networks used for automatic image recognition analyse the content progressively, through a chain of processing stages.


Computer vision API- Skyl.ai

#artificialintelligence

Computer vision APIs let you run computer vision tasks programmatically, at scale and in real time. Once set up, a computer vision API can run computer vision tasks on millions of items simultaneously. This makes it easy to integrate these APIs into your apps or websites and deliver cutting-edge, computer-vision-backed experiences to your customers. For example, you might have a reverse image search engine that takes a photo as input and returns a set of similar images from the web. You can implement this in no time using computer vision APIs, even if you do not have any expertise in machine learning or computer vision.
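A hypothetical sketch of what calling such an API might look like; the endpoint URL, request fields, and response shape below are invented for illustration and will differ for any real provider, including Skyl.ai.

```python
import requests

# Hypothetical endpoint, for illustration only; consult the provider's
# documentation for the real API contract.
API_URL = "https://api.example.com/v1/reverse-image-search"

def similar_images(photo_path, api_key):
    """Upload a photo and return a list of similar-image URLs."""
    with open(photo_path, "rb") as f:
        resp = requests.post(
            API_URL,
            headers={"Authorization": f"Bearer {api_key}"},
            files={"image": f},
        )
    resp.raise_for_status()
    return resp.json().get("results", [])
```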


Alibaba's New AI Chip Can Process Nearly 80K Images Per Second

#artificialintelligence

The Hanguang 800 is being implemented across many application scenarios within Aliyun, ranging from video classification to smart city applications. For example, the company's popular Pailitao platform applies visual image search to e-commerce, allowing customers to search for items by taking a photo of the query object. Using AI-based image recognition and indexing powered by the new Hanguang 800, Aliyun can increase image processing efficiency by 12 times compared to GPUs. With regard to smart city tech, Aliyun says it previously used 40 traditional GPUs to process videos of central Hangzhou with a latency of 300 ms; the task now requires only four Hanguang 800 chips, with a lower latency of 150 ms.


Martin's Playtime with Tensorflow Lite / Dr Who image recognition

#artificialintelligence

Digital Maker's Martin Evans has been experimenting with TensorFlow Lite on the Raspberry Pi 4 to recognise Dr Who character shapes. This is a short video of the Pi camera recognising a Dalek and a Cyberman, with the output going to an Adafruit display screen.
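For readers who want to try something similar, a minimal TensorFlow Lite inference step on the Pi looks roughly like the sketch below; the model file and preprocessing are placeholders, not Martin's actual setup.

```python
import numpy as np
from tflite_runtime.interpreter import Interpreter

# Placeholder model path; any image-classification .tflite model works.
interpreter = Interpreter(model_path="model.tflite")
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

def classify(frame):
    """Run one camera frame (already resized to the model's input
    shape and cast to its expected dtype) through the model and
    return the index of the top-scoring class."""
    interpreter.set_tensor(inp["index"], np.expand_dims(frame, 0))
    interpreter.invoke()
    scores = interpreter.get_tensor(out["index"])[0]
    return int(np.argmax(scores))
```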