Image Matching
NeuReg: Domain-invariant 3D Image Registration on Human and Mouse Brains
Medical brain imaging relies heavily on image registration to accurately curate structural boundaries of brain features for various healthcare applications. Deep learning models have shown remarkable performance in image registration in recent years, yet they often struggle with the diversity of 3D brain volumes, which vary in structure, contrast, and imaging domain. In this work, we present NeuReg, a neuro-inspired, domain-invariant 3D image registration architecture. NeuReg generates domain-agnostic representations of imaging features and uses a shifting-window (Swin) Transformer block as its encoder, enabling the model to capture variations across brain imaging modalities and species. We establish a new benchmark on publicly available multi-domain datasets comprising human and mouse 3D brain volumes. Extensive experiments show that NeuReg outperforms existing deep learning-based image registration baselines and delivers a substantial performance boost on cross-domain datasets, where models are trained on a single 'source' domain and tested on completely 'unseen' target domains. Our work establishes a new state of the art for domain-agnostic 3D brain image registration, underpinned by a neuro-inspired, Transformer-based architecture.
- North America > United States > Washington > King County > Redmond (0.04)
- Europe > France > Grand Est > Bas-Rhin > Strasbourg (0.04)
- Europe > Denmark > Capital Region > Copenhagen (0.04)
- Health & Medicine > Therapeutic Area > Neurology (1.00)
- Health & Medicine > Health Care Technology (1.00)
- Health & Medicine > Diagnostic Medicine > Imaging (1.00)
- Information Technology > Sensing and Signal Processing > Image Processing (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Image Matching (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
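A minimal sketch of the deformable-registration pattern that NeuReg-style models follow: an encoder maps the concatenated fixed/moving volumes to a dense 3D displacement field, which is then used to resample (warp) the moving volume. The tiny convolutional encoder below is only a stand-in for the paper's Swin Transformer encoder; the names, shapes, and losses are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyRegistrationNet(nn.Module):
    def __init__(self):
        super().__init__()
        # Stand-in encoder: NeuReg uses a shifting-window (Swin) Transformer here.
        self.encoder = nn.Sequential(
            nn.Conv3d(2, 16, 3, padding=1), nn.ReLU(),
            nn.Conv3d(16, 3, 3, padding=1),   # 3 channels = (dx, dy, dz) displacement
        )

    def forward(self, fixed, moving):
        # fixed, moving: (B, 1, D, H, W)
        flow = self.encoder(torch.cat([fixed, moving], dim=1))   # (B, 3, D, H, W)
        return self.warp(moving, flow), flow

    @staticmethod
    def warp(volume, flow):
        B, _, D, H, W = volume.shape
        # Identity sampling grid in normalized [-1, 1] coordinates, (x, y, z) order.
        zz, yy, xx = torch.meshgrid(
            torch.linspace(-1, 1, D), torch.linspace(-1, 1, H),
            torch.linspace(-1, 1, W), indexing="ij")
        grid = torch.stack((xx, yy, zz), dim=-1).to(volume)       # (D, H, W, 3)
        grid = grid.unsqueeze(0).expand(B, -1, -1, -1, -1)
        # Add the predicted displacement (assumed already in normalized units).
        offset = flow.permute(0, 2, 3, 4, 1)                      # (B, D, H, W, 3)
        return F.grid_sample(volume, grid + offset, align_corners=True)

fixed = torch.rand(1, 1, 32, 32, 32)
moving = torch.rand(1, 1, 32, 32, 32)
warped, flow = ToyRegistrationNet()(fixed, moving)
# Training would minimize a similarity loss (e.g. MSE or NCC) between `warped`
# and `fixed`, plus a smoothness penalty on `flow`.
```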
Multi-modal deformable image registration using untrained neural networks
Nguyen, Quang Luong Nhat, Cao, Ruiming, Waller, Laura
Image registration techniques usually assume that the images to be registered are of a certain type (e.g. single- vs. multi-modal, 2D vs. 3D, rigid vs. deformable), and no general method exists that works for data under all conditions. We propose a registration method that uses neural networks for image representation. Our method uses untrained networks with limited representation capacity as an implicit prior that guides the registration toward a good solution. Unlike previous approaches that are specialized for specific data types, our method handles both rigid and non-rigid, as well as single- and multi-modal registration, without requiring changes to the model or objective function. We have performed a comprehensive evaluation study on a variety of datasets and demonstrated promising performance.
- North America > United States > California > Alameda County > Berkeley (0.14)
- Europe > Switzerland > Zürich > Zürich (0.05)
- Information Technology > Sensing and Signal Processing > Image Processing (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Image Matching (0.66)
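A hedged sketch of the "untrained network as implicit prior" idea: instead of learning from a training set, the weights of a small, randomly initialized network are optimized for a single image pair so that the displacement field it outputs aligns the moving image to the fixed one. The architecture, scaling factor, and loss below are illustrative assumptions, not the authors' exact model or objective.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def warp_2d(img, flow):
    # img: (1, C, H, W); flow: (1, 2, H, W) displacements in normalized (dx, dy) coords.
    _, _, H, W = img.shape
    yy, xx = torch.meshgrid(torch.linspace(-1, 1, H),
                            torch.linspace(-1, 1, W), indexing="ij")
    grid = torch.stack((xx, yy), dim=-1).unsqueeze(0)            # (1, H, W, 2)
    return F.grid_sample(img, grid + flow.permute(0, 2, 3, 1), align_corners=True)

# Untrained, low-capacity network: its limited capacity acts as a smoothness prior.
net = nn.Sequential(
    nn.Conv2d(2, 8, 3, padding=1), nn.ReLU(),
    nn.Conv2d(8, 8, 3, padding=1), nn.ReLU(),
    nn.Conv2d(8, 2, 3, padding=1),
)

fixed = torch.rand(1, 1, 64, 64)
moving = torch.rand(1, 1, 64, 64)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for step in range(200):                               # per-pair optimization, no training set
    flow = net(torch.cat([fixed, moving], dim=1)) * 0.1
    loss = F.mse_loss(warp_2d(moving, flow), fixed)   # single-modal similarity; a multi-modal
    opt.zero_grad(); loss.backward(); opt.step()      # criterion (e.g. MI) would replace MSE
```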
Accelerated Sub-Image Search For Variable-Size Patches Identification Based On Virtual Time Series Transformation And Segmentation
This paper addresses two tasks: (i) fixed-size objects such as hay bales are to be identified in an aerial image given a reference image of the object, and (ii) variable-size patches, such as areas on fields requiring spot spraying or other handling, are to be identified in an image given a small-scale reference image. Both tasks are related; the second differs in that identified sub-images similar to the reference image are further clustered before patch contours are determined by solving a traveling salesman problem. Both tasks are complex in that the exact number of similar sub-images is not known a priori. The main contribution of this paper is an acceleration mechanism for sub-image search that transforms an image into a multivariate time series along the RGB channels and then segments it to reduce the 2D search space in the image. Two variations of the acceleration mechanism are compared to exhaustive search on diverse synthetic and real-world images. Quantitatively, the proposed method reduces solve times by up to two orders of magnitude while qualitatively delivering comparable visual results. The proposed method is neural-network-free and does not use any image pre-processing.
- North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)
- Europe > Switzerland (0.04)
- Information Technology > Sensing and Signal Processing > Image Processing (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
- Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Image Matching (0.64)
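An illustrative sketch of the core acceleration idea: collapse the image into a short multivariate time series along the RGB channels, segment or filter that series, and only run the expensive patch comparison where the statistics resemble the reference patch. The row-mean transformation and the threshold below are assumptions made for illustration; the paper's exact transformation and segmentation procedure may differ.

```python
import numpy as np

def row_series(img):
    # (H, W, 3) image -> (H, 3) multivariate series: per-row mean of each RGB channel.
    return img.reshape(img.shape[0], -1, 3).mean(axis=1)

def candidate_rows(img, ref_patch, tol=20.0):
    series = row_series(img)                         # (H, 3)
    target = ref_patch.reshape(-1, 3).mean(axis=0)   # reference RGB signature
    # Keep only rows whose channel means are close to the reference signature;
    # exhaustive template matching is then restricted to these rows.
    dist = np.linalg.norm(series - target, axis=1)
    return np.flatnonzero(dist < tol)

img = np.random.randint(0, 256, (480, 640, 3)).astype(np.float32)
ref = np.random.randint(0, 256, (32, 32, 3)).astype(np.float32)
rows = candidate_rows(img, ref)
print(f"search reduced to {len(rows)} of {img.shape[0]} rows")
```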
MoonMetaSync: Lunar Image Registration Analysis
Kumar, Ashutosh, Kaushal, Sarthak, Murthy, Shiv Vignesh
This paper compares scale-invariant (SIFT) and scale-variant (ORB) feature detection methods, alongside our novel feature detector, IntFeat, specifically applied to lunar imagery. We evaluate these methods using low-resolution (128x128) and high-resolution (1024x1024) lunar image patches, providing insights into their performance across scales in challenging extraterrestrial environments. IntFeat combines high-level features from SIFT and low-level features from ORB into a single vector space for robust lunar image registration. We introduce SyncVision, a Python package that compares lunar images using various registration methods, including SIFT, ORB, and IntFeat. Our analysis includes upscaling low-resolution lunar images using bilinear and bicubic interpolation, offering a unique perspective on registration effectiveness across scales and feature detectors in lunar landscapes. This research contributes to computer vision and planetary science by comparing feature detection methods for lunar imagery and introducing a versatile tool for lunar image registration and evaluation, with implications for multi-resolution image analysis in space exploration applications.
- Information Technology > Sensing and Signal Processing > Image Processing (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Image Matching (0.86)
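A small, self-contained sketch of the kind of comparison the paper describes: register two lunar image patches with SIFT and with ORB and report match and inlier counts. This is ordinary OpenCV usage, not the authors' SyncVision package or the IntFeat detector, and the file names are placeholders.

```python
import cv2
import numpy as np

ref = cv2.imread("lunar_ref.png", cv2.IMREAD_GRAYSCALE)     # hypothetical file names
mov = cv2.imread("lunar_moving.png", cv2.IMREAD_GRAYSCALE)

def register(detector, norm_type):
    """Match features between the two patches and fit a homography with RANSAC."""
    k1, d1 = detector.detectAndCompute(ref, None)
    k2, d2 = detector.detectAndCompute(mov, None)
    matches = sorted(cv2.BFMatcher(norm_type, crossCheck=True).match(d1, d2),
                     key=lambda m: m.distance)
    src = np.float32([k2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([k1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    return len(matches), int(mask.sum())

print("SIFT:", register(cv2.SIFT_create(), cv2.NORM_L2))      # scale-invariant
print("ORB :", register(cv2.ORB_create(), cv2.NORM_HAMMING))  # scale-variant, binary
```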
Recurrent Registration Neural Networks for Deformable Image Registration
Parametric spatial transformation models have been successfully applied to image registration tasks. In such models, the transformation of interest is parameterized by a fixed set of basis functions, such as B-splines. Each basis function is placed at a fixed regular grid position within the image domain because the transformation of interest is not known in advance. As a consequence, not all basis functions will necessarily contribute to the final transformation, which results in a non-compact representation of the transformation. Our recurrent registration neural network instead describes the transformation as a sequence of local deformations: for each element in the sequence, it computes a local deformation defined by its position, shape, and weight.
- Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Image Matching (0.64)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.64)
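A sketch of the non-grid-based idea in the abstract: instead of fixing basis functions on a regular grid, a sequence of local deformations, each described by a position, a shape (here an isotropic Gaussian width), and a weight vector, is accumulated into a dense displacement field. The Gaussian form and the hand-picked values below are illustrative assumptions, not the paper's model, which predicts the sequence with a recurrent network.

```python
import numpy as np

def add_local_deformation(field, position, sigma, weight):
    # field: (H, W, 2) displacement field; weight: (dy, dx) displacement vector.
    H, W, _ = field.shape
    yy, xx = np.mgrid[0:H, 0:W]
    bump = np.exp(-((yy - position[0]) ** 2 + (xx - position[1]) ** 2) / (2 * sigma ** 2))
    field += bump[..., None] * np.asarray(weight)
    return field

# A recurrent model would emit one (position, sigma, weight) triple per step;
# here three hand-picked steps stand in for the predicted sequence.
field = np.zeros((64, 64, 2))
for pos, sigma, w in [((20, 20), 5.0, (2.0, 0.0)),
                      ((40, 30), 8.0, (0.0, -1.5)),
                      ((32, 50), 4.0, (1.0, 1.0))]:
    field = add_local_deformation(field, pos, sigma, w)
print("max displacement:", np.abs(field).max())
```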
Not All Images are Worth 16x16 Words: Dynamic Transformers for Efficient Image Recognition
Vision Transformers (ViT) have achieved remarkable success in large-scale image recognition. They split every 2D image into a fixed number of patches, each of which is treated as a token. Generally, representing an image with more tokens leads to higher prediction accuracy, but it also drastically increases computational cost. To achieve a decent trade-off between accuracy and speed, the number of tokens is empirically set to 16x16 or 14x14. In this paper, we argue that every image has its own characteristics, and ideally the token number should be conditioned on each individual input.
- Information Technology > Sensing and Signal Processing > Image Processing (0.76)
- Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Image Matching (0.63)
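A small sketch of the accuracy/compute trade-off discussed above: the number of tokens a ViT processes is set by the patch size, so patchifying the same 224x224 image with 32x32 versus 16x16 patches yields 7x7 = 49 versus 14x14 = 196 tokens, and self-attention cost grows quadratically with that count. The paper's dynamic idea would run the coarse version first and fall back to the fine one only when the prediction is not confident; the embedding below is a generic patchifier, not the paper's model.

```python
import torch
import torch.nn as nn

def patchify(img, patch_size, dim=192):
    # img: (B, 3, 224, 224) -> (B, num_tokens, dim); fresh random projection for illustration.
    proj = nn.Conv2d(3, dim, kernel_size=patch_size, stride=patch_size)
    tokens = proj(img)                        # (B, dim, H/p, W/p)
    return tokens.flatten(2).transpose(1, 2)

img = torch.rand(1, 3, 224, 224)
for p in (32, 16):                            # coarse vs. fine patching
    t = patchify(img, p)
    print(f"patch {p}x{p}: {t.shape[1]} tokens")   # 49 vs. 196 tokens
```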
This Looks Like That: Deep Learning for Interpretable Image Recognition
When we are faced with challenging image classification tasks, we often explain our reasoning by dissecting the image and pointing out prototypical aspects of one class or another. The mounting evidence for each of the classes helps us make our final decision. In this work, we introduce a deep network architecture, the prototypical part network (ProtoPNet), that reasons in a similar way: the network dissects the image by finding prototypical parts and combines evidence from the prototypes to make a final classification. The model thus reasons in a way that is qualitatively similar to the way ornithologists, physicians, and others would explain how to solve challenging image classification tasks. The network uses only image-level labels for training, without any annotations for parts of images.
- Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Image Matching (0.40)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.40)
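A minimal sketch of the "this looks like that" reasoning step: learned prototype vectors are compared against every spatial position of a convolutional feature map, the best match per prototype becomes its evidence score, and a linear layer turns the evidence into class logits. The shapes, sizes, and similarity function are simplified assumptions standing in for the full ProtoPNet architecture.

```python
import torch
import torch.nn as nn

class ProtoLayer(nn.Module):
    def __init__(self, num_prototypes=20, channels=64, num_classes=5):
        super().__init__()
        self.prototypes = nn.Parameter(torch.rand(num_prototypes, channels))
        self.classifier = nn.Linear(num_prototypes, num_classes, bias=False)

    def forward(self, feats):                                     # feats: (B, C, H, W)
        B, C, H, W = feats.shape
        f = feats.permute(0, 2, 3, 1).reshape(B, H * W, C)        # (B, HW, C) feature patches
        p = self.prototypes.unsqueeze(0).expand(B, -1, -1)        # (B, P, C)
        dists = torch.cdist(f, p)                                 # (B, HW, P) patch-prototype distances
        min_dist = dists.min(dim=1).values                        # best-matching patch per prototype
        similarity = torch.log((min_dist + 1) / (min_dist + 1e-4))  # high when distance is small
        return self.classifier(similarity)                        # (B, num_classes) logits

feats = torch.rand(2, 64, 7, 7)        # backbone conv features (e.g. from a CNN)
logits = ProtoLayer()(feats)
print(logits.shape)                     # torch.Size([2, 5])
```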
Arbicon-Net: Arbitrary Continuous Geometric Transformation Networks for Image Registration
This paper concerns the underdetermined problem of estimating geometric transformations between image pairs. Recent methods introduce deep neural networks to predict the controlling parameters of hand-crafted geometric transformation models. However, these low-dimensional parametric models are incapable of estimating highly complex geometric transforms and offer limited flexibility to model the actual geometric deformation between image pairs. To address this issue, we present an end-to-end trainable deep neural network, named Arbitrary Continuous Geometric Transformation Networks (Arbicon-Net), to directly predict the dense displacement field for pairwise image alignment. Arbicon-Net generalizes from training data to predict the desired arbitrary continuous geometric transformation in a data-driven manner for unseen new pairs of images.
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Image Matching (0.45)
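A quick illustration of the distinction the abstract draws: a low-dimensional parametric warp (here a 6-parameter 2D affine) versus a dense displacement field with two values per pixel, which is what Arbicon-Net predicts directly. This is generic PyTorch warping, not the Arbicon-Net architecture itself, and the random field stands in for a network prediction.

```python
import torch
import torch.nn.functional as F

img = torch.rand(1, 1, 64, 64)

# (a) Parametric: 6 affine parameters describe the whole transformation.
theta = torch.tensor([[[1.0, 0.1, 0.0],
                       [0.0, 1.0, 0.05]]])                      # (1, 2, 3)
grid_affine = F.affine_grid(theta, img.shape, align_corners=True)
warped_affine = F.grid_sample(img, grid_affine, align_corners=True)

# (b) Dense: an arbitrary continuous deformation needs 2 values per pixel,
#     i.e. 2 * 64 * 64 = 8192 degrees of freedom for this image.
flow = 0.05 * torch.randn(1, 64, 64, 2)                         # stand-in for a predicted field
identity = F.affine_grid(torch.tensor([[[1.0, 0.0, 0.0],
                                        [0.0, 1.0, 0.0]]]), img.shape, align_corners=True)
warped_dense = F.grid_sample(img, identity + flow, align_corners=True)
print(warped_affine.shape, warped_dense.shape)
```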
Reviews: Bilevel Distance Metric Learning for Robust Image Recognition
Summary: The authors propose a bilevel method for metric learning, where the lower level extracts discriminative features from the data using a sparse coding scheme with graph regularization, which effectively captures their underlying geometric structure, and the upper level is a classic metric learning approach that uses the learned sparse coefficients. These two components are integrated into a joint optimization problem, and an efficient optimization algorithm is developed accordingly. Hence, new data can be classified based on the learned dictionary and the corresponding metric. In the experiments, the authors demonstrate the model's ability to extract more discriminative features from high-dimensional data while being more robust to noise.
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.63)
- Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Image Matching (0.40)
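A drastically simplified, two-stage stand-in for the pipeline the review describes: (1) sparse-code the inputs against a learned dictionary, then (2) learn a Mahalanobis metric M = L^T L on those codes so same-class pairs end up closer than different-class pairs. The paper solves the two levels as a joint bilevel problem with graph regularization; this sketch only conveys the two ingredients, not the actual formulation, and all hyperparameters are arbitrary.

```python
import numpy as np
import torch
from sklearn.decomposition import DictionaryLearning

X = np.random.rand(100, 30)                        # toy data
y = np.random.randint(0, 2, size=100)

# Lower level (stand-in): sparse codes from a learned dictionary.
codes = DictionaryLearning(n_components=20, alpha=1.0, max_iter=50).fit_transform(X)

# Upper level (stand-in): learn a Mahalanobis metric on the codes.
Z = torch.tensor(codes, dtype=torch.float32)
labels = torch.tensor(y)
L = torch.randn(20, 20, requires_grad=True)        # metric factor: M = L^T L
opt = torch.optim.Adam([L], lr=1e-2)
for _ in range(100):
    i, j = torch.randint(0, 100, (64,)), torch.randint(0, 100, (64,))
    d = ((Z[i] - Z[j]) @ L.T).pow(2).sum(dim=1)    # squared Mahalanobis distance
    same = (labels[i] == labels[j]).float()
    # Pull same-class pairs together, push different-class pairs beyond a margin.
    loss = (same * d + (1 - same) * torch.relu(1.0 - d)).mean()
    opt.zero_grad(); loss.backward(); opt.step()
```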
Reviews: A Simple Cache Model for Image Recognition
This paper presents a cache model to be used in image recognition tasks. The authors argue that class-specific information can be retrieved from earlier layers of the network to improve the accuracy of an already trained model, without having to re-train or fine-tune it. This is achieved by extracting and caching the activations of some layers, along with the class labels, at training time. At test time, a similarity measure is used to calculate how close the input is to the information stored in memory. Experiments show that performance is improved on CIFAR-10/100 and ImageNet.
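A hedged sketch of the cache idea: store (feature, label) pairs from a late layer of an already-trained network for the training set, and at test time form a prediction from a similarity-weighted vote over the cached items, which can be mixed with the model's own softmax output. The feature extractor, similarity temperature, and mixing weight below are illustrative choices, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def cache_predict(test_feat, cache_feats, cache_labels, num_classes, temp=0.1):
    # Cosine similarity between the test feature and every cached training feature.
    sims = F.cosine_similarity(test_feat.unsqueeze(0), cache_feats, dim=1)
    weights = torch.softmax(sims / temp, dim=0)                  # (N,)
    one_hot = F.one_hot(cache_labels, num_classes).float()       # (N, num_classes)
    return weights @ one_hot                                     # cache-based class probabilities

# Toy cache: 1000 stored training features (e.g. penultimate-layer activations).
cache_feats = torch.randn(1000, 512)
cache_labels = torch.randint(0, 10, (1000,))
test_feat = torch.randn(512)

p_cache = cache_predict(test_feat, cache_feats, cache_labels, num_classes=10)
p_model = torch.softmax(torch.randn(10), dim=0)      # stand-in for the network's own output
p_final = 0.5 * p_cache + 0.5 * p_model              # mix cache and model predictions
print(p_final.argmax().item())
```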