AITopics | Image Matching

Image Matching

Spatially Covariant Image Registration with Text Prompts

Chen, Xiang, Liu, Min, Wang, Rongguang, Hu, Renjiu, Liu, Dongdong, Li, Gaolei, Zhang, Hang

arXiv.org Artificial Intelligence, Feb-5-2024

Medical images are often characterized by their structured anatomical representations and spatially inhomogeneous contrasts. Leveraging anatomical priors in neural networks can greatly enhance their utility in resource-constrained clinical settings. Prior research has harnessed such information for image segmentation, yet progress in deformable image registration has been modest. Our work introduces textSCF, a novel method that integrates spatially covariant filters and textual anatomical prompts encoded by visual-language models, to fill this gap. This approach optimizes an implicit function that correlates text embeddings of anatomical regions to filter weights, relaxing the typical translation-invariance constraint of convolutional operations. TextSCF not only boosts computational efficiency but can also retain or improve registration accuracy. By capturing the contextual interplay between anatomical regions, it offers impressive inter-regional transferability and the ability to preserve structural discontinuities during registration. TextSCF's performance has been rigorously tested on inter-subject brain MRI and abdominal CT registration tasks, outperforming existing state-of-the-art models in the MICCAI Learn2Reg 2021 challenge and leading the leaderboard. In abdominal registrations, textSCF's larger model variant improved the Dice score by 11.3% over the second-best model, while its smaller variant maintained similar accuracy but with an 89.13% reduction in network parameters and a 98.34% decrease in computational operations.
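The Dice score quoted in the abstract measures the overlap between two segmentation masks. The following is a minimal sketch of that metric, assuming binary masks given as flat lists of 0/1 values; the function name `dice_score` and the toy masks are illustrative and do not come from the paper.

```python
def dice_score(mask_a, mask_b):
    """Dice coefficient 2|A and B| / (|A| + |B|) for two binary masks of equal length."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same length")
    # Count positions that are foreground in both masks.
    intersection = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    # Total foreground size of the two masks.
    total = sum(1 for a in mask_a if a) + sum(1 for b in mask_b if b)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy example: a warped label mask compared against the fixed image's mask.
warped = [0, 1, 1, 1, 0, 0]
fixed = [0, 0, 1, 1, 1, 0]
print(dice_score(warped, fixed))
```

In registration benchmarks the score is typically computed per anatomical label and then averaged, which this sketch leaves out.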

deformation field, registration, textscf, (14 more...)

arXiv.org Artificial Intelligence

2311.15607

Country:

North America > United States > New York (0.04)
South America > Peru > Lima Department > Lima Province > Lima (0.04)
North America > United States > Pennsylvania (0.04)
(6 more...)

Genre:

Research Report > New Finding (0.93)
Research Report > Experimental Study (0.93)
Research Report > Promising Solution (0.68)

Industry:

Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Health & Medicine > Health Care Technology (0.88)
Health & Medicine > Therapeutic Area > Neurology (0.88)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(3 more...)

A Tournament of Transformation Models: B-Spline-based vs. Mesh-based Multi-Objective Deformable Image Registration

Andreadis, Georgios, Mulder, Joas I., Bouter, Anton, Bosman, Peter A. N., Alderliesten, Tanja

arXiv.org Artificial Intelligence, Jan-30-2024

The transformation model is an essential component of any deformable image registration approach. It provides a representation of physical deformations between images, thereby defining the range and realism of registrations that can be found. Two types of transformation models have emerged as popular choices: B-spline models and mesh models. Although both models have been investigated in detail, a direct comparison has not yet been made, since the models are optimized using very different optimization methods in practice. B-spline models are predominantly optimized using gradient-descent methods, while mesh models are typically optimized using finite-element method solvers or evolutionary algorithms. Multi-objective optimization methods, which aim to find a diverse set of high-quality trade-off registrations, are increasingly acknowledged to be important in deformable image registration. Since these methods search for a diverse set of registrations, they can provide a more complete picture of the capabilities of different transformation models, making them suitable for a comparison of models. In this work, we conduct the first direct comparison between B-spline and mesh transformation models, by optimizing both models with the same state-of-the-art multi-objective optimization method, the Multi-Objective Real-Valued Gene-pool Optimal Mixing Evolutionary Algorithm (MO-RV-GOMEA). The combination with B-spline transformation models, moreover, is novel. We experimentally compare both models on two different registration problems that are both based on pelvic CT scans of cervical cancer patients, featuring large deformations. Our results, on three cervical cancer patients, indicate that the choice of transformation model can have a profound impact on the diversity and quality of achieved registration outcomes.
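For readers unfamiliar with B-spline transformation models, the idea can be sketched in one dimension: a coarse grid of control-point displacements is blended by cubic B-spline basis functions to give a smooth displacement at any position. This is only an illustrative sketch, not the setup used with MO-RV-GOMEA in the paper; the function names and the control-point layout (control point `i` at position `(i - 1) * spacing`) are assumptions.

```python
import math


def cubic_bspline_basis(u):
    """The four uniform cubic B-spline weights for a local coordinate u in [0, 1)."""
    return (
        (1 - u) ** 3 / 6.0,
        (3 * u**3 - 6 * u**2 + 4) / 6.0,
        (-3 * u**3 + 3 * u**2 + 3 * u + 1) / 6.0,
        u**3 / 6.0,
    )


def displacement(x, control_displacements, spacing):
    """Displacement at position x, blended from the four nearest control points.

    Assumed layout: control point i sits at position (i - 1) * spacing.
    """
    cell = math.floor(x / spacing)
    if cell < 0 or cell + 3 >= len(control_displacements):
        raise ValueError("x lies outside the region covered by the control grid")
    u = x / spacing - cell  # local coordinate inside the grid cell
    weights = cubic_bspline_basis(u)
    return sum(w * control_displacements[cell + k] for k, w in enumerate(weights))


# Moving a single control point only changes the deformation nearby (local support).
controls = [0.0, 0.0, 2.0, 0.0, 0.0, 0.0]
for x in (0.0, 1.0, 1.5, 2.5):
    print(x, displacement(x, controls, spacing=1.0))
```

The local support visible here is one reason B-spline grids are convenient to optimize: each parameter influences only a small neighborhood of the image.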

b-spline, registration, transformation model, (13 more...)

arXiv.org Artificial Intelligence

2401.16867

Country:

Europe > Netherlands > South Holland > Leiden (0.05)
Europe > Netherlands > South Holland > Delft (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)

Genre: Research Report > New Finding (0.66)

Industry: Health & Medicine > Therapeutic Area > Oncology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Image Matching (0.83)

On Image Search in Histopathology

Tizhoosh, H. R., Pantanowitz, Liron

arXiv.org Artificial Intelligence, Jan-14-2024

Histopathology images can be acquired from camera-mounted microscopes or whole slide scanners. Utilizing similarity calculations to match patients based on these images holds significant potential in research and clinical contexts. Recent advancements in search technologies allow for nuanced quantification of cellular structures across diverse tissue types, facilitating comparisons and enabling inferences about diagnosis, prognosis, and predictions for new patients when compared against a curated database of diagnosed and treated cases. In this paper, we comprehensively review the latest developments in image search technologies for histopathology, offering a concise overview tailored for computational pathology researchers seeking effective, fast, and efficient image search methods in their work.

image retrieval, image search, yottixel, (14 more...)

arXiv.org Artificial Intelligence

2401.08699

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Minnesota > Olmsted County > Rochester (0.04)
Europe > United Kingdom > England (0.04)

Genre:

Research Report (0.82)
Overview (0.68)

Industry:

Health & Medicine > Diagnostic Medicine (1.00)
Health & Medicine > Therapeutic Area > Oncology (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(5 more...)

Iterative PnP and its application in 3D-2D vascular image registration for robot navigation

Song, Jingwei, Yang, Keke, Zhang, Zheng, Li, Meng, Cao, Tuoyu, Ghaffari, Maani

arXiv.org Artificial Intelligence, Jan-11-2024

This paper reports on a new real-time robot-centered 3D-2D vascular image alignment algorithm, which is robust to outliers and can align nonrigid shapes. Few works have managed to achieve both real-time and accurate performance for vascular intervention robots. This work bridges high-accuracy 3D-2D registration techniques and computational efficiency requirements in intervention robot applications. We categorize centerline-based vascular 3D-2D image registration problems as an iterative Perspective-n-Point (PnP) problem and propose to use the Levenberg-Marquardt solver on the Lie manifold. Then, the recently developed Reproducing Kernel Hilbert Space (RKHS) algorithm is introduced to overcome the "big-to-small" problem in typical robotic scenarios. Finally, iteratively reweighted least squares is applied to solve the RKHS-based formulation efficiently. Experiments indicate that the proposed algorithm performs registration at over 50 Hz (rigid) and 20 Hz (nonrigid) and obtains registration accuracy competitive with other works. Results indicate that our Iterative PnP is suitable for future vascular intervention robot applications.
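The iteratively reweighted least squares (IRLS) step mentioned in the abstract can be illustrated on a much simpler problem than the paper's RKHS formulation. The sketch below robustly estimates a single 1D value from measurements that contain an outlier. The Cauchy weight function, the `scale` parameter, and the function name `irls_location` are assumptions chosen for illustration, not the authors' choices.

```python
def irls_location(samples, scale=1.0, iterations=20):
    """Robust 1D location estimate by iteratively reweighted least squares."""
    # Start from the ordinary least-squares solution, which is the plain mean.
    estimate = sum(samples) / len(samples)
    for _ in range(iterations):
        # Cauchy weights (an assumed choice): large residuals get small weights.
        weights = [1.0 / (1.0 + ((s - estimate) / scale) ** 2) for s in samples]
        # Closed-form weighted least-squares update for a single unknown.
        estimate = sum(w * s for w, s in zip(weights, samples)) / sum(weights)
    return estimate


# Five consistent measurements near 1.0 and one gross outlier.
measurements = [1.0, 1.1, 0.9, 1.05, 0.95, 10.0]
plain_mean = sum(measurements) / len(measurements)
print("mean:", plain_mean)
print("IRLS:", irls_location(measurements))
```

Each iteration is an ordinary weighted least-squares solve, which is why the scheme stays cheap enough to be attractive for real-time use.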

algorithm, iterative pnp, registration, (15 more...)

arXiv.org Artificial Intelligence

2310.12551

Country:

North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)
North America > Canada > Quebec > Capitale-Nationale Region > Québec (0.04)
North America > Canada > Quebec > Capitale-Nationale Region > Quebec City (0.04)
(4 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Diagnostic Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Image Matching (0.61)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

REBUS: A Robust Evaluation Benchmark of Understanding Symbols

Gritsevskiy, Andrew, Panickssery, Arjun, Kirtland, Aaron, Kauffman, Derik, Gundlach, Hans, Gritsevskaya, Irina, Cavanagh, Joe, Chiang, Jonathan, La Roux, Lydia, Hung, Michelle

arXiv.org Artificial Intelligence, Jan-10-2024

We propose a new benchmark evaluating the performance of multimodal large language models on rebus puzzles. The dataset covers 333 original examples of image-based wordplay, cluing 13 categories such as movies, composers, major cities, and food. To achieve good performance on the benchmark of identifying the clued word or phrase, models must combine image recognition and string manipulation with hypothesis testing, multi-step reasoning, and an understanding of human cognition, making for a complex, multimodal evaluation of capabilities. We find that proprietary models such as GPT-4V and Gemini Pro significantly outperform all other tested models. However, even the best model has a final accuracy of just 24%, highlighting the need for substantial improvements in reasoning. Further, models rarely understand all parts of a puzzle, and are almost always incapable of retroactively explaining the correct answer. Our benchmark can therefore be used to identify major shortcomings in the knowledge and reasoning of multimodal large language models.

language model, puzzle, wang, (14 more...)

arXiv.org Artificial Intelligence

2401.05604

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Oregon > Multnomah County > Portland (0.04)
North America > United States > New York > New York County > New York City (0.04)
(6 more...)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Image Matching (0.34)

Deep learning in medical image registration: introduction and survey

Hammoudeh, Ahmad, Dupont, Stéphane

arXiv.org Artificial Intelligence, Jan-10-2024

Image registration (IR) is a process that deforms images to align them with respect to a reference space, making it easier for medical practitioners to examine various medical images in a standardized reference frame, such as having the same rotation and scale. This document introduces image registration using a simple numeric example. It provides a definition of image registration along with a space-oriented symbolic representation. This review covers various aspects of image transformations, including affine, deformable, invertible, and bidirectional transformations, as well as medical image registration algorithms such as VoxelMorph, Demons, SyN, Iterative Closest Point, and SynthMorph. It also explores atlas-based registration and multistage image registration techniques, including coarse-fine and pyramid approaches. Furthermore, this survey paper discusses medical image registration taxonomies, datasets, and evaluation measures such as correlation-based metrics, segmentation-based metrics, processing time, and model size. It also explores applications in image-guided surgery, motion tracking, and tumor diagnosis. Finally, the document addresses future research directions, including the further development of transformers.
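As a small numeric illustration of the affine case mentioned above (this is not the example from the survey itself), the sketch below builds a 2D similarity transform from a rotation angle, a uniform scale, and a translation, and applies it to landmark coordinates. The helper names `make_affine` and `apply_affine` are hypothetical.

```python
import math


def make_affine(angle_deg, scale, tx, ty):
    """2x3 matrix for rotation by angle_deg, uniform scaling, then translation."""
    a = math.radians(angle_deg)
    c, s = scale * math.cos(a), scale * math.sin(a)
    return ((c, -s, tx), (s, c, ty))


def apply_affine(matrix, point):
    """Map a 2D point (x, y) through the 2x3 affine matrix."""
    (a, b, tx), (c, d, ty) = matrix
    x, y = point
    return (a * x + b * y + tx, c * x + d * y + ty)


# Rotate by 90 degrees, double the size, and shift one unit along x.
transform = make_affine(90.0, 2.0, 1.0, 0.0)
landmarks = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
for p in landmarks:
    print(p, "->", apply_affine(transform, p))
```

Registration then amounts to searching for the transform parameters that bring the moving image's landmarks or intensities into agreement with the reference.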

image registration, medical image registration, registration, (14 more...)

arXiv.org Artificial Intelligence

2309.00727

Country:

Europe > Jersey (0.14)
North America > United States > New Jersey (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > France (0.04)

Genre:

Research Report (1.00)
Overview (1.00)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Image Matching (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

LF-ViT: Reducing Spatial Redundancy in Vision Transformer for Efficient Image Recognition

Hu, Youbing, Cheng, Yun, Lu, Anqi, Cao, Zhiqiang, Wei, Dawei, Liu, Jie, Li, Zhijun

arXiv.org Artificial Intelligence, Jan-7-2024

The Vision Transformer (ViT) excels in accuracy when handling high-resolution images, yet it confronts the challenge of significant spatial redundancy, leading to increased computational and memory requirements. To address this, we present the Localization and Focus Vision Transformer (LF-ViT). This model operates by strategically curtailing computational demands without impinging on performance. In the Localization phase, a reduced-resolution image is processed; if a definitive prediction remains elusive, our pioneering Neighborhood Global Class Attention (NGCA) mechanism is triggered, effectively identifying and spotlighting class-discriminative regions based on initial findings. Subsequently, in the Focus phase, this designated region is used from the original image to enhance recognition. Uniquely, LF-ViT employs consistent parameters across both phases, ensuring seamless end-to-end optimization. Our empirical tests affirm LF-ViT's prowess: it remarkably decreases DeiT-S's FLOPs by 63% and concurrently amplifies throughput twofold. Code for this project is available at https://github.com/edgeai1/LF-ViT.git.
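The two-phase control flow described in the abstract can be sketched as an early-exit loop: classify a cheap low-resolution view first, and only fall back to a full-resolution crop when the first prediction is not confident. This is a schematic sketch, not the LF-ViT code from the linked repository. The callables `coarse_model`, `locate_region`, and `fine_model`, the dictionary-of-probabilities interface, and the `threshold` value are all hypothetical stand-ins.

```python
def two_phase_predict(image, coarse_model, locate_region, fine_model, threshold=0.8):
    """Return (label, phase_used). All three callables are hypothetical stand-ins."""
    # Localization phase: classify a reduced-resolution view of the image.
    probs = coarse_model(image)
    label = max(probs, key=probs.get)
    if probs[label] >= threshold:
        # Confident enough: stop early and skip the Focus phase entirely.
        return label, "localization"
    # Otherwise pick a class-discriminative region of the original image...
    region = locate_region(image, probs)
    # ...and run the Focus phase on that full-resolution crop.
    probs = fine_model(region)
    return max(probs, key=probs.get), "focus"


# Toy stand-ins that return fixed class probabilities.
easy = two_phase_predict(
    "img", lambda im: {"cat": 0.9, "dog": 0.1}, lambda im, p: im, lambda r: {}
)
hard = two_phase_predict(
    "img",
    lambda im: {"cat": 0.55, "dog": 0.45},
    lambda im, p: im,
    lambda r: {"cat": 0.2, "dog": 0.8},
)
print(easy, hard)
```

Since the abstract states that both phases share parameters, `coarse_model` and `fine_model` could be the same network applied to different inputs.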

class-discriminative region, lf-vit, transformer, (14 more...)

arXiv.org Artificial Intelligence

2402.00033

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
Asia > China > Heilongjiang Province > Harbin (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Image Matching (0.41)

Image recognition accuracy: An unseen challenge confounding today's AI

AIHub, Jan-3-2024, 14:41:13 GMT

MVT, minimum viewing time, is a dataset difficulty metric measuring the minimum presentation time required for an image to be recognized. Researchers hope this metric will be used to evaluate models' performance and biological plausibility and guide the creation of new more difficult datasets, leading to new computer vision techniques that perform better in real life. Imagine you are scrolling through the photos on your phone and you come across an image that at first you can't recognize. It looks like maybe something fuzzy on the couch; could it be a pillow or a coat? That ball of fluff is your friend's cat, Mocha.

artificial intelligence, machine learning, pattern recognition, (16 more...)

AIHub

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Image Matching (0.40)

BusReF: Infrared-Visible images registration and fusion focus on reconstructible area using one set of features

Zhang, Zeyang, Li, Hui, Xu, Tianyang, Wu, Xiaojun, Kittler, Josef

arXiv.org Artificial Intelligence, Dec-30-2023

In a scenario where multi-modal cameras are operating together, the problem of working with non-aligned images cannot be avoided. Yet, existing image fusion algorithms rely heavily on strictly registered input image pairs to produce more precise fusion results, as a way to improve the performance of downstream high-level vision tasks. In order to relax this assumption, one can attempt to register images first. However, the existing methods for registering multiple modalities have limitations, such as complex structures and reliance on significant semantic information. This paper aims to address the problem of image registration and fusion in a single framework, called BusReF. We focus on the Infrared-Visible image registration and fusion (IVRF) task. In this framework, the input unaligned image pairs pass through three stages: coarse registration, fine registration, and fusion. It will be shown that the unified approach enables more robust IVRF. We also propose a novel training and evaluation strategy, involving the use of masks to reduce the influence of non-reconstructible regions on the loss functions, which greatly improves the accuracy and robustness of the fusion task. Last but not least, a gradient-aware fusion network is designed to preserve the complementary information. The advanced performance of this algorithm is demonstrated by

fusion, image registration, registration, (17 more...)

arXiv.org Artificial Intelligence

2401.00285

Country:

Europe > United Kingdom > England > Surrey > Guildford (0.04)
Asia > Middle East > Saudi Arabia > Northern Borders Province > Arar (0.04)
Asia > China > Jiangsu Province (0.04)

Genre: Research Report (0.64)

Industry: Media > Photography (0.34)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

Deformable Image Registration with Stochastically Regularized Biomechanical Equilibrium

Alvarez, Pablo, Cotin, Stéphane

arXiv.org Artificial Intelligence, Dec-22-2023

Numerous regularization methods for deformable image registration aim at enforcing smooth transformations, but are difficult to tune a priori and lack a clear physical basis. Physically inspired strategies have emerged, offering a sound theoretical basis, but still necessitating complex discretization and resolution schemes. This study introduces a regularization strategy that does not require discretization, making it compatible with current registration frameworks, while retaining the benefits of physically motivated regularization for medical image registration. The proposed method performs favorably on both synthetic and real datasets, exhibiting an accuracy comparable to current state-of-the-art methods.

dataset, registration, regularization, (15 more...)

arXiv.org Artificial Intelligence

2312.14987

Country: Europe > France > Grand Est > Bas-Rhin > Strasbourg (0.05)

Genre: Research Report (0.70)

Industry: Health & Medicine (0.49)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Image Matching (0.86)
