FisheyeHDK: Hyperbolic Deformable Kernel Learning for Ultra-Wide Field-of-View Image Recognition


Conventional convolutional neural networks (CNNs) trained on narrow Field-of-View (FoV) images are the state-of-the-art approaches for object recognition tasks. Some methods have proposed adapting CNNs to ultra-wide FoV images by learning deformable kernels. However, they are limited by Euclidean geometry, and their accuracy degrades under the strong distortions caused by fisheye projections. In this work, we demonstrate that learning the shape of convolution kernels in non-Euclidean spaces is better than existing deformable kernel methods. In particular, we propose a new approach that learns deformable kernel parameters (positions) in hyperbolic space. FisheyeHDK is a hybrid CNN architecture combining hyperbolic and Euclidean convolution layers for position and feature learning. First, we provide an intuition of hyperbolic space for wide FoV images. Using synthetic distortion profiles, we demonstrate the effectiveness of our approach. We select two datasets of perspective images, Cityscapes and BDD100K 2020, which we transform to fisheye equivalents at different scaling factors (analogous to focal lengths). Finally, we provide an experiment on data collected by a real fisheye camera. Validations and experiments show that our approach improves on existing deformable kernel methods for CNN adaptation to fisheye images.
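One common way to place Euclidean parameters in hyperbolic space is the exponential map at the origin of the Poincaré ball. The sketch below illustrates that mapping only; the abstract does not say which hyperbolic model or mapping FisheyeHDK uses, so the choice of the Poincaré ball, the curvature parameter `c`, and the function names `expmap0` and `distance_from_origin` are all assumptions made for illustration.

```python
import math


def expmap0(v, c=1.0):
    """Map a Euclidean offset v into the Poincare ball of curvature -c.

    Uses the standard exponential map at the origin:
    exp_0(v) = tanh(sqrt(c) * |v|) * v / (sqrt(c) * |v|).
    Illustrative only; not taken from the FisheyeHDK paper.
    """
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        return list(v)
    scale = math.tanh(math.sqrt(c) * norm) / (math.sqrt(c) * norm)
    return [scale * x for x in v]


def distance_from_origin(p, c=1.0):
    """Hyperbolic distance between the ball's origin and point p."""
    norm = math.sqrt(sum(x * x for x in p))
    return 2.0 / math.sqrt(c) * math.atanh(math.sqrt(c) * norm)


# A hypothetical 2D kernel offset, mapped into the unit ball.
offset = [0.3, 0.4]
point = expmap0(offset)
```

Every mapped point lies strictly inside the ball of radius 1/sqrt(c), and offsets of growing Euclidean length are compressed toward the boundary, which is the kind of non-uniform geometry the abstract contrasts with Euclidean deformable kernels.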

Fashion Image Search Engine - AI Summary


Introduction. "Computers are able to see, hear and learn. Welcome to the future." (Dave Waters) In this post, I want to talk about a computer vision use case called Content-Based Image Retrieval, or CBIR for short. In simple terms, it means retrieving images relevant to the user's needs from image databases on the basis of low-level visual features. Image Search…
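As a rough illustration of retrieval by low-level visual features, the sketch below ranks images by the cosine similarity of their gray-level histograms. The post does not say which features or similarity measure it uses, so the histogram feature, the cosine score, and the names `gray_histogram`, `cosine_similarity`, and `search` are assumptions chosen to keep the example small.

```python
import math


def gray_histogram(pixels, bins=8):
    """Normalized intensity histogram of a flat list of 0-255 gray values."""
    hist = [0.0] * bins
    for p in pixels:
        hist[min(p * bins // 256, bins - 1)] += 1.0
    total = sum(hist) or 1.0
    return [h / total for h in hist]


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def search(query_pixels, database, top_k=3):
    """Return the names of the top_k database images most similar to the query.

    database maps an image name to its flat list of gray values.
    """
    q = gray_histogram(query_pixels)
    scored = [
        (cosine_similarity(q, gray_histogram(px)), name)
        for name, px in database.items()
    ]
    scored.sort(reverse=True)
    return [name for _, name in scored[:top_k]]


# Toy database of three tiny synthetic "images".
database = {"dark": [10] * 16, "mid": [128] * 16, "bright": [240] * 16}
best = search([20] * 16, database, top_k=1)
```

In practice the histogram would usually be replaced by richer descriptors, but the retrieval loop (extract features, score against the database, sort) keeps the same shape.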

Multi-modal unsupervised brain image registration using edge maps

Diffeomorphic deformable multi-modal image registration is a challenging task that aims to bring images acquired by different modalities into the same coordinate space while preserving the topology and invertibility of the transformation. Recent research has focused on leveraging deep learning approaches for this task, as these have been shown to achieve competitive registration accuracy while being computationally more efficient than traditional iterative registration methods. In this work, we propose a simple yet effective unsupervised deep learning-based multi-modal image registration approach that benefits from auxiliary information coming from the gradient magnitude of the image, i.e., the image edges, during training. The intuition behind this is that image locations with a strong gradient are assumed to denote a transition between tissues; these are locations of high information value that can act as a geometric constraint. The task is similar to using segmentation maps to drive the training, but the edge maps are easier and faster to acquire and do not require annotations. We evaluate our approach in the context of registering multi-modal (T1w to T2w) magnetic resonance (MR) brain images of different subjects using three different loss functions that are said to assist multi-modal registration, showing that in all cases the auxiliary information leads to better results without compromising the runtime.
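An edge map of the kind described above can be sketched as the gradient magnitude of the image. The example below works on a small 2D image with finite differences; the abstract does not give the authors' exact gradient operator or how the map enters the loss, so the central-difference scheme and the name `gradient_magnitude` are assumptions, and real MR data would be 3D volumes.

```python
import math


def gradient_magnitude(image):
    """Gradient magnitude of a 2D image given as a list of rows of floats.

    Interior pixels use central differences; border pixels fall back to
    one-sided differences.
    """
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            x0, x1 = max(x - 1, 0), min(x + 1, w - 1)
            y0, y1 = max(y - 1, 0), min(y + 1, h - 1)
            gx = (image[y][x1] - image[y][x0]) / max(x1 - x0, 1)
            gy = (image[y1][x] - image[y0][x]) / max(y1 - y0, 1)
            out[y][x] = math.hypot(gx, gy)
    return out


# A toy image with a vertical step between two "tissues".
step = [[0.0, 0.0, 1.0, 1.0] for _ in range(3)]
edges = gradient_magnitude(step)
```

The response is zero in the flat regions and nonzero only along the intensity transition, which is why such a map needs no manual annotation.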



Image and face recognition platforms and solutions have been a major focus in the technology sector over the past two decades. Image and face recognition technology is used in many industries, including healthcare, security, and e-commerce. This has resulted in remarkable progress. Experts believe this technology can perform at or even surpass human level in many medical diagnosis and security domains. Many brands now use image recognition technology to harness the intersection of visual analytics and text to understand the industry and target audience, and to deploy visual intelligence to drive meaningful communications.

An Intro to AI Image Recognition and Image Generation


Artificial intelligence is undoubtedly altering the ways we live, work, and even create. It enhances the productivity, quality, and speed of work. Image recognition, which used to be tedious work, is now performed by AI-enabled machines. The image-generating capability of artificial intelligence has opened ways for people to go in directions they had never heard of.

Top Face and Image Recognition Apps to Follow in December 2021


With the development of technology, image recognition has become an integral part of our lives. There are diverse kinds of products and applications on the market now, intended to analyze and recognize specific objects in images. Biometrics is now a critical feature used by firms and even individuals for their security. The technology is now widely applied and helps control false arrests, diagnose genetic disorders, and reduce malware attacks, cybercrimes, etc. Each application differs in performance, working methods, use cases, etc. Users can choose a product based on their requirements.

TransMorph: Transformer for unsupervised medical image registration

In the last decade, convolutional neural networks (ConvNets) have been a major focus of research in medical image analysis. However, the performance of ConvNets may be limited by a lack of explicit consideration of the long-range spatial relationships in an image. Recently, Vision Transformer architectures have been proposed to address the shortcomings of ConvNets and have produced state-of-the-art performance in many medical imaging applications. Transformers may be a strong candidate for image registration because their unlimited receptive field enables a more precise comprehension of the spatial correspondence between moving and fixed images. Here, we present TransMorph, a hybrid Transformer-ConvNet model for volumetric medical image registration. This paper also presents diffeomorphic and Bayesian variants of TransMorph: the diffeomorphic variants ensure topology-preserving deformations, and the Bayesian variant produces a well-calibrated registration uncertainty estimate. We extensively validated the proposed models using 3D medical images from three applications: inter-patient and atlas-to-patient brain MRI registration and phantom-to-CT registration. The proposed models are evaluated in comparison to a variety of existing registration methods and Transformer architectures. Qualitative and quantitative results demonstrate that the proposed Transformer-based model leads to a substantial performance improvement over the baseline methods, confirming the effectiveness of Transformers for medical image registration.

Use Cases and Roll-Out Tips for Image Recognition in Retail


Hit hard by the pandemic, the retail sector is on the lookout for innovation. Among the many technologies retailers focus on, artificial intelligence is an undeniable leader. The market for artificial intelligence solutions in retail is projected to reach $23.32 billion by 2027, quite a leap from $5.06 billion in 2021. Within AI, computer vision and image recognition have become notable areas of interest for the retail sector: the global market for retail image recognition software is expected to grow at a CAGR of 22% and reach a value of $3.7 billion by 2025. By bringing image recognition into their technology mix, retailers hope to optimize inventories, simplify checkouts, and improve the customer experience.
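To put the two AI-in-retail figures on the same footing as the quoted CAGR, the compound annual growth rate they imply can be worked out directly. The sketch below uses the standard CAGR formula; the function name `cagr` is hypothetical, and treating 2021 to 2027 as six compounding periods is an assumption about how the projection was made.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate: (end / start) ** (1 / years) - 1."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Figures from the text, in billions of US dollars.
implied = cagr(5.06, 23.32, 2027 - 2021)
print(f"Implied CAGR: {implied:.1%}")
```

Under these assumptions the overall AI-in-retail projection implies growth of about 29% per year, somewhat faster than the 22% quoted for retail image recognition software alone.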

Image Recognition In WhatsApp Chatbot - Using Twilio & Azure Function App


This is the final article in a 3-part series on Image Recognition in WhatsApp Chatbot. The first article, "LUIS – Create a Conversation App", discussed creating a LUIS service in Azure. The second article, "Image Recognition in WhatsApp Chatbot - Using Azure AI", went on to create the models and the image recognition service in Visual Studio. This last article focuses on using Twilio and an Azure Function App to develop the WhatsApp chatbot. Twilio offers a cloud communications platform as a service (CPaaS) that enables developers to make and receive phone calls programmatically, send and receive text messages, and perform numerous other communication functions through web service APIs.