AITopics | Pattern Recognition

Pattern Recognition

"... the research area that studies the operation and design of systems that recognize patterns in data." It includes statistical methods like discriminant analysis, feature extraction, error estimation, cluster analysis.
– Pattern Recognition Laboratory at Delft University of Technology

REBUS: A Robust Evaluation Benchmark of Understanding Symbols

Gritsevskiy, Andrew, Panickssery, Arjun, Kirtland, Aaron, Kauffman, Derik, Gundlach, Hans, Gritsevskaya, Irina, Cavanagh, Joe, Chiang, Jonathan, La Roux, Lydia, Hung, Michelle

arXiv.org Artificial Intelligence, Jan-10-2024

We propose a new benchmark evaluating the performance of multimodal large language models on rebus puzzles. The dataset covers 333 original examples of image-based wordplay, cluing 13 categories such as movies, composers, major cities, and food. To achieve good performance on the benchmark of identifying the clued word or phrase, models must combine image recognition and string manipulation with hypothesis testing, multi-step reasoning, and an understanding of human cognition, making for a complex, multimodal evaluation of capabilities. We find that proprietary models such as GPT-4V and Gemini Pro significantly outperform all other tested models. However, even the best model has a final accuracy of just 24%, highlighting the need for substantial improvements in reasoning. Further, models rarely understand all parts of a puzzle, and are almost always incapable of retroactively explaining the correct answer. Our benchmark can therefore be used to identify major shortcomings in the knowledge and reasoning of multimodal large language models.

language model, puzzle, wang, (14 more...)

arXiv.org Artificial Intelligence

2401.05604

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Oregon > Multnomah County > Portland (0.04)
North America > United States > New York > New York County > New York City (0.04)
(6 more...)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Image Matching (0.34)

Deep learning in medical image registration: introduction and survey

Hammoudeh, Ahmad, Dupont, Stéphane

arXiv.org Artificial Intelligence, Jan-10-2024

Image registration (IR) is a process that deforms images to align them with respect to a reference space, making it easier for medical practitioners to examine various medical images in a standardized reference frame, such as having the same rotation and scale. This document introduces image registration using a simple numeric example. It provides a definition of image registration along with a space-oriented symbolic representation. This review covers various aspects of image transformations, including affine, deformable, invertible, and bidirectional transformations, as well as medical image registration algorithms such as Voxelmorph, Demons, SyN, Iterative Closest Point, and SynthMorph. It also explores atlas-based registration and multistage image registration techniques, including coarse-fine and pyramid approaches. Furthermore, this survey paper discusses medical image registration taxonomies, datasets, and evaluation measures such as correlation-based metrics, segmentation-based metrics, processing time, and model size. It also explores applications in image-guided surgery, motion tracking, and tumor diagnosis. Finally, the document addresses future research directions, including the further development of transformers.
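To make the alignment idea above concrete, here is a minimal sketch that is not taken from the survey. It registers a tiny "moving" image to a "fixed" one by brute-force search over integer translations, scoring each candidate with the sum of squared differences (SSD). The function names, the toy 5x5 images, and the search range are all assumptions made for this illustration; practical registration typically uses richer transformations (affine, deformable) and similarity measures.

```python
def shift_image(image, dy, dx, fill=0):
    """Translate a 2D list-of-lists image by (dy, dx), padding with `fill`."""
    h, w = len(image), len(image[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                out[ny][nx] = image[y][x]
    return out


def ssd(a, b):
    """Sum of squared intensity differences between two same-sized images."""
    return sum((pa - pb) ** 2 for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def register_translation(fixed, moving, max_shift=2):
    """Return (dy, dx, cost) for the integer shift of `moving` that best matches `fixed`."""
    best = None
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            cost = ssd(fixed, shift_image(moving, dy, dx))
            if best is None or cost < best[0]:
                best = (cost, dy, dx)
    return best[1], best[2], best[0]


# Toy data (hypothetical): a 2x2 bright square in a 5x5 image.
fixed = [
    [0, 0, 0, 0, 0],
    [0, 9, 9, 0, 0],
    [0, 9, 9, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
# The moving image is the fixed image displaced one row down and two columns right.
moving = shift_image(fixed, 1, 2)

dy, dx, cost = register_translation(fixed, moving)
print(dy, dx, cost)
```

The recovered shift is the inverse of the displacement applied to create `moving`, which is the sense in which registration maps an image back into the reference frame.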

image registration, medical image registration, registration, (14 more...)

arXiv.org Artificial Intelligence

2309.00727

Country:

Europe > Jersey (0.14)
North America > United States > New Jersey (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > France (0.04)

Genre:

Research Report (1.00)
Overview (1.00)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Image Matching (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Evaluating Gesture Recognition in Virtual Reality

Sabbella, Sandeep Reddy, Kaszuba, Sara, Leotta, Francesco, Serrarens, Pascal, Nardi, Daniele

arXiv.org Artificial Intelligence, Jan-9-2024

Human-Robot Interaction (HRI) has become increasingly important as robots are being integrated into various aspects of daily life. One key aspect of HRI is gesture recognition, which allows robots to interpret and respond to human gestures in real time. Gesture recognition plays an important role in non-verbal communication in HRI. To this end, there is ongoing research on how such non-verbal communication can strengthen verbal communication and improve the system's overall efficiency, thereby enhancing the user experience with the robot. However, several challenges need to be addressed in gesture recognition systems, including data generation, transferability, scalability, generalizability, standardization, and the lack of benchmarking of gestural systems. In this preliminary paper, we address the challenge of data generation using virtual reality simulations, and the challenge of standardization by presenting gestures for a set of commands that can be used as a standard for ground robots.

gesture recognition, recognition, simulation, (14 more...)

arXiv.org Artificial Intelligence

2401.04545

Country:

Europe > Sweden > Stockholm > Stockholm (0.05)
Europe > Italy > Lazio > Rome (0.05)
North America > United States > New York > New York County > New York City (0.04)
(2 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Human Computer Interaction > Interfaces > Virtual Reality (1.00)
Information Technology > Artificial Intelligence > Vision > Gesture Recognition (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
(2 more...)

LF-ViT: Reducing Spatial Redundancy in Vision Transformer for Efficient Image Recognition

Hu, Youbing, Cheng, Yun, Lu, Anqi, Cao, Zhiqiang, Wei, Dawei, Liu, Jie, Li, Zhijun

arXiv.org Artificial Intelligence, Jan-7-2024

The Vision Transformer (ViT) excels in accuracy when handling high-resolution images, yet it confronts the challenge of significant spatial redundancy, leading to increased computational and memory requirements. To address this, we present the Localization and Focus Vision Transformer (LF-ViT). This model operates by strategically curtailing computational demands without impinging on performance. In the Localization phase, a reduced-resolution image is processed; if a definitive prediction remains elusive, our pioneering Neighborhood Global Class Attention (NGCA) mechanism is triggered, effectively identifying and spotlighting class-discriminative regions based on initial findings. Subsequently, in the Focus phase, this designated region is used from the original image to enhance recognition. Uniquely, LF-ViT employs consistent parameters across both phases, ensuring seamless end-to-end optimization. Our empirical tests affirm LF-ViT's prowess: it remarkably decreases DeiT-S's FLOPs by 63% and concurrently amplifies throughput twofold. Code for this project is available at https://github.com/edgeai1/LF-ViT.git.
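The two-phase control flow described above can be pictured with a short sketch of confidence-based early exit. This is only an illustration under stated assumptions and not the LF-ViT implementation: `toy_classify` is a hypothetical stand-in for the shared-parameter model, `downsample` uses plain striding, the 0.8 threshold is arbitrary, and `pick_region` simply takes the brightest window of the full-resolution image as a crude placeholder for NGCA, whose actual mechanism is not specified in the abstract.

```python
def downsample(image, factor=2):
    """Reduce resolution by keeping every `factor`-th row and column."""
    return [row[::factor] for row in image[::factor]]


def pick_region(image, size):
    """Placeholder for region selection: the size x size crop with the largest sum."""
    h, w = len(image), len(image[0])
    best_sum, best_yx = None, (0, 0)
    for y in range(h - size + 1):
        for x in range(w - size + 1):
            s = sum(sum(row[x:x + size]) for row in image[y:y + size])
            if best_sum is None or s > best_sum:
                best_sum, best_yx = s, (y, x)
    y, x = best_yx
    return [row[x:x + size] for row in image[y:y + size]]


def localize_then_focus(image, classify, threshold=0.8, factor=2):
    """Classify a low-res view first; fall back to a full-res crop if unsure."""
    low_res = downsample(image, factor)
    probs = classify(low_res)
    label = max(probs, key=probs.get)
    if probs[label] >= threshold:
        return label, "localization"  # confident: exit early
    # Not confident: classify a crop of the original image with the same model.
    crop = pick_region(image, size=len(low_res))
    probs = classify(crop)
    return max(probs, key=probs.get), "focus"


def toy_classify(img):
    """Hypothetical classifier: 'bright' probability equals mean intensity."""
    pixels = [p for row in img for p in row]
    mean = sum(pixels) / len(pixels)
    return {"bright": mean, "dark": 1.0 - mean}


easy = [[1.0] * 4 for _ in range(4)]   # uniformly bright image
hard = [[0.4] * 4 for _ in range(4)]   # dim image with one bright 2x2 patch
for y in (0, 1):
    for x in (0, 1):
        hard[y][x] = 1.0

print(localize_then_focus(easy, toy_classify))
print(localize_then_focus(hard, toy_classify))
```

In this toy setup the uniformly bright image exits after the low-resolution pass, while the mostly dim image is only classified after the second pass on the cropped region. Reusing one `classify` callable for both passes mirrors the abstract's point that the same parameters serve both phases.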

class-discriminative region, lf-vit, transformer, (14 more...)

arXiv.org Artificial Intelligence

2402.00033

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
Asia > China > Heilongjiang Province > Harbin (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Image Matching (0.41)

View-based Explanations for Graph Neural Networks

Chen, Tingyang, Qiu, Dazhuo, Wu, Yinghui, Khan, Arijit, Ke, Xiangyu, Gao, Yunjun

arXiv.org Artificial Intelligence, Jan-7-2024

Generating explanations for graph neural networks (GNNs) has been studied to understand their behavior in analytical tasks such as graph classification. Existing approaches aim to understand the overall results of GNNs rather than providing explanations for specific class labels of interest, and may return explanation structures that are neither easy to access nor directly queryable. We propose GVEX, a novel paradigm that generates Graph Views for EXplanation. (1) We design a two-tier explanation structure called explanation views. An explanation view consists of a set of graph patterns and a set of induced explanation subgraphs. Given a database G of multiple graphs and a specific class label l assigned by a GNN-based classifier M, it concisely describes the fraction of G that best explains why l is assigned by M. (2) We propose quality measures and formulate an optimization problem to compute optimal explanation views for GNN explanation. We show that the problem is $\Sigma^2_P$-hard. (3) We present two algorithms. The first one follows an explain-and-summarize strategy that first generates high-quality explanation subgraphs which best explain GNNs in terms of feature influence maximization, and then performs a summarization step to generate patterns. We show that this strategy provides an approximation ratio of 1/2. Our second algorithm makes a single pass over an input node stream in batches to incrementally maintain explanation views, with an anytime quality guarantee of 1/4 approximation. Using real-world benchmark data, we experimentally demonstrate the effectiveness, efficiency, and scalability of GVEX. Through case studies, we showcase the practical applications of GVEX.

explanation subgraph, node, subgraph, (13 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3639295

2401.02086

Country:

Europe > Denmark > North Jutland > Aalborg (0.04)
Asia > China > Zhejiang Province > Ningbo (0.04)
Asia > China > Zhejiang Province > Hangzhou (0.04)
North America > United States > Ohio > Cuyahoga County > Cleveland (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

Tackling Electrode Shift In Gesture Recognition with HD-EMG Electrode Subsets

Pereira, Joao, Chalatsis, Dimitrios, Hodossy, Balint, Farina, Dario

arXiv.org Artificial Intelligence, Jan-5-2024

sEMG pattern recognition algorithms have been explored extensively in decoding movement intent, yet are known to be vulnerable to changing recording conditions, exhibiting significant drops in performance across subjects, and even across sessions. Multi-channel surface EMG, also referred to as high-density sEMG (HD-sEMG) systems, have been used to improve performance with the information collected through the use of additional electrodes. However, a lack of robustness is ever present due to limited datasets and the difficulties in addressing sources of variability, such as electrode placement. In this study, we propose training on a collection of input channel subsets and augmenting our training distribution with data from different electrode locations, simultaneously targeting electrode shift and reducing input dimensionality. Our method increases robustness against electrode shift and results in significantly higher intersession performance across subjects and classification algorithms.
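One way to picture the channel-subset idea above is to slide a smaller window over the high-density electrode grid, so that each window position acts like a slightly shifted electrode placement with fewer input channels. The sketch below is an illustration under stated assumptions and not the authors' procedure: the 8x8 grid, the 4x4 window, the row-major channel numbering, and both function names are hypothetical.

```python
def window_channel_subsets(grid_rows, grid_cols, win_rows, win_cols):
    """Map each window offset (row, col) to the channel indices it covers.

    Channels are assumed to be numbered row-major over the full grid.
    """
    subsets = {}
    for r0 in range(grid_rows - win_rows + 1):
        for c0 in range(grid_cols - win_cols + 1):
            subsets[(r0, c0)] = [
                (r0 + r) * grid_cols + (c0 + c)
                for r in range(win_rows)
                for c in range(win_cols)
            ]
    return subsets


def augment_with_subsets(sample, subsets):
    """Turn one full-grid sample into one reduced sample per window offset.

    `sample` holds one feature value per channel of the full grid.
    """
    return [
        (offset, [sample[ch] for ch in channels])
        for offset, channels in subsets.items()
    ]


subsets = window_channel_subsets(8, 8, 4, 4)
full_sample = [float(ch) for ch in range(64)]  # dummy per-channel features
augmented = augment_with_subsets(full_sample, subsets)

print(len(subsets))            # number of window positions
print(subsets[(0, 1)][:5])     # first channels of a window shifted one column
print(len(augmented[0][1]))    # channels per reduced sample
```

Each reduced sample has 16 inputs instead of 64, and neighbouring offsets differ by exactly one electrode row or column, so training on all of them exposes a classifier to shifted views of the same recording.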

channel subset, electrode shift, subset, (15 more...)

arXiv.org Artificial Intelligence

2401.02773

Genre: Research Report > New Finding (0.90)

Industry: Health & Medicine (0.49)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (1.00)

Survey on Publicly Available Sinhala Natural Language Processing Tools and Research

de Silva, Nisansa

arXiv.org Artificial Intelligence, Jan-4-2024

Sinhala is the native language of the Sinhalese people, who make up the largest ethnic group of Sri Lanka. The language belongs to the globe-spanning language tree, Indo-European. However, due to poverty in both linguistic and economic capital, Sinhala, from the perspective of Natural Language Processing tools and research, remains a resource-poor language which has neither the economic drive its cousin English has nor the sheer push of the law of numbers a language such as Chinese has. A number of research groups from Sri Lanka have noticed this dearth and the resultant dire need for proper tools and research for Sinhala natural language processing. However, due to various reasons, these attempts seem to lack coordination and awareness of each other. The objective of this paper is to fill that gap with a comprehensive literature survey of the publicly available Sinhala natural language tools and research so that the researchers working in this field can better utilize contributions of their peers. As such, we shall be uploading this paper to arXiv and updating it periodically to reflect the advances made in the field.

ieee, international conference, sinhala, (14 more...)

arXiv.org Artificial Intelligence

1906.02358

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Europe > Finland > Uusimaa > Helsinki (0.04)
North America > United States > New York (0.04)
(13 more...)

Genre:

Overview (1.00)
Research Report > New Finding (0.67)

Industry:

Media > News (1.00)
Information Technology > Services (1.00)
Education (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
(12 more...)

Image recognition accuracy: An unseen challenge confounding today's AI

AIHub, Jan-3-2024, 14:41:13 GMT

MVT, minimum viewing time, is a dataset difficulty metric measuring the minimum presentation time required for an image to be recognized. Researchers hope this metric will be used to evaluate models' performance and biological plausibility and to guide the creation of new, more difficult datasets, leading to new computer vision techniques that perform better in real life. Imagine you are scrolling through the photos on your phone and you come across an image that at first you can't recognize. It looks like maybe something fuzzy on the couch; could it be a pillow or a coat? That ball of fluff is your friend's cat, Mocha.

artificial intelligence, machine learning, pattern recognition, (16 more...)

AIHub

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Image Matching (0.40)

A Review of Findings from Neuroscience and Cognitive Psychology as Possible Inspiration for the Path to Artificial General Intelligence

Leon, Florin

arXiv.org Artificial Intelligence, Jan-3-2024

This review aims to contribute to the quest for artificial general intelligence by examining neuroscience and cognitive psychology methods for potential inspiration. Despite the impressive advancements achieved by deep learning models in various domains, they still have shortcomings in abstract reasoning and causal understanding. Such capabilities should be ultimately integrated into artificial intelligence systems in order to surpass data-driven limitations and support decision making in a way more similar to human intelligence. This work is a vertical review that attempts a wide-ranging exploration of brain function, spanning from lower-level biological neurons, spiking neural networks, and neuronal ensembles to higher-level concepts such as brain anatomy, vector symbolic architectures, cognitive and categorization models, and cognitive architectures. The hope is that these concepts may offer insights for solutions in artificial general intelligence.

artificial general intelligence, cambridge university press, holographic reduced representation, (16 more...)

arXiv.org Artificial Intelligence

2401.10904

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.13)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.13)
North America > Canada > Ontario > Toronto (0.13)
(10 more...)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(6 more...)

BusReF: Infrared-Visible images registration and fusion focus on reconstructible area using one set of features

Zhang, Zeyang, Li, Hui, Xu, Tianyang, Wu, Xiaojun, Kittler, Josef

arXiv.org Artificial Intelligence, Dec-30-2023

In a scenario where multi-modal cameras are operating together, the problem of working with non-aligned images cannot be avoided. Yet, existing image fusion algorithms rely heavily on strictly registered input image pairs to produce more precise fusion results, as a way to improve the performance of downstream high-level vision tasks. In order to relax this assumption, one can attempt to register images first. However, the existing methods for registering multiple modalities have limitations, such as complex structures and reliance on significant semantic information. This paper aims to address the problem of image registration and fusion in a single framework, called BusRef. We focus on the Infrared-Visible image registration and fusion (IVRF) task. In this framework, the input unaligned image pairs pass through three stages: coarse registration, fine registration, and fusion. It will be shown that the unified approach enables more robust IVRF. We also propose a novel training and evaluation strategy, involving the use of masks to reduce the influence of non-reconstructible regions on the loss functions, which greatly improves the accuracy and robustness of the fusion task. Last but not least, a gradient-aware fusion network is designed to preserve the complementary information. The advanced performance of this algorithm is demonstrated by

fusion, image registration, registration, (17 more...)

arXiv.org Artificial Intelligence

2401.00285

Country:

Europe > United Kingdom > England > Surrey > Guildford (0.04)
Asia > Middle East > Saudi Arabia > Northern Borders Province > Arar (0.04)
Asia > China > Jiangsu Province (0.04)

Genre: Research Report (0.64)

Industry: Media > Photography (0.34)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)
