AITopics | Pattern Recognition

Collaborating Authors

Pattern Recognition

"... the research area that studies the operation and design of systems that recognize patterns in data." It includes statistical methods like discriminant analysis, feature extraction, error estimation, cluster analysis.
– Pattern Recognition Laboratory at Delft University of Technology

News Overviews Instructional Materials AI-Alerts Classics

Phi: Leveraging Pattern-based Hierarchical Sparsity for High-Efficiency Spiking Neural Networks

Wei, Chiyue, Duan, Bowen, Guo, Cong, Zhang, Jingyang, Song, Qingyue, Li, Hai "Helen", Chen, Yiran

arXiv.org Artificial IntelligenceMay-19-2025

Spiking Neural Networks (SNNs) are gaining attention for their energy efficiency and biological plausibility, utilizing 0-1 activation sparsity through spike-driven computation. While existing SNN accelerators exploit this sparsity to skip zero computations, they often overlook the unique distribution patterns inherent in binary activations. In this work, we observe that particular patterns exist in spike activations, which we can utilize to reduce the substantial computation of SNN models. Based on these findings, we propose a novel \textbf{pattern-based hierarchical sparsity} framework, termed \textbf{\textit{Phi}}, to optimize computation. \textit{Phi} introduces a two-level sparsity hierarchy: Level 1 exhibits vector-wise sparsity by representing activations with pre-defined patterns, allowing for offline pre-computation with weights and significantly reducing most runtime computation. Level 2 features element-wise sparsity by complementing the Level 1 matrix, using a highly sparse matrix to further reduce computation while maintaining accuracy. We present an algorithm-hardware co-design approach. Algorithmically, we employ a k-means-based pattern selection method to identify representative patterns and introduce a pattern-aware fine-tuning technique to enhance Level 2 sparsity. Architecturally, we design \textbf{\textit{Phi}}, a dedicated hardware architecture that efficiently processes the two levels of \textit{Phi} sparsity on the fly. Extensive experiments demonstrate that \textit{Phi} achieves a $3.45\times$ speedup and a $4.93\times$ improvement in energy efficiency compared to state-of-the-art SNN accelerators, showcasing the effectiveness of our framework in optimizing SNN computation.

machine learning, pattern recognition, sparsity, (17 more...)

arXiv.org Artificial Intelligence

2505.10909

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.16)
North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Add feedback

Flexible Graph Similarity Computation With A Proactive Optimization Strategy

Liu, Zhouyang, Liu, Ning, Chen, Yixin, He, Jiezhong, Li, Dongsheng

arXiv.org Artificial IntelligenceMay-16-2025

Graph Edit Distance (GED) offers a principled and flexible measure of graph similarity, as it quantifies the minimum cost needed to transform one graph into another with customizable edit operation costs. Despite recent learning-based efforts to approximate GED via vector space representations, existing methods struggle with adapting to varying operation costs. Furthermore, they suffer from inefficient, reactive mapping refinements due to reliance on isolated node-level distance as guidance. To address these issues, we propose GEN, a novel learning-based approach for flexible GED approximation. GEN addresses the varying costs adaptation by integrating operation costs prior to match establishment, enabling mappings to dynamically adapt to cost variations. Furthermore, GEN introduces a proactive guidance optimization strategy that captures graph-level dependencies between matches, allowing informed matching decisions in a single step without costly iterative refinements. Extensive evaluations on real-world and synthetic datasets demonstrate that GEN achieves up to 37.8% reduction in GED approximation error and 72.7% reduction in inference time compared with state-of-the-art methods, while consistently maintaining robustness under diverse cost settings and graph sizes.

artificial intelligence, machine learning, pattern recognition, (18 more...)

arXiv.org Artificial Intelligence

2504.06533

Country:

Asia > China (0.46)
North America > United States > California (0.28)

Genre: Research Report (0.84)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
(2 more...)

Add feedback

Improving Generalization of Medical Image Registration Foundation Model

Hu, Jing, Yu, Kaiwei, Xian, Hongjiang, Hu, Shu, Wang, Xin

arXiv.org Artificial IntelligenceMay-13-2025

Deformable registration is a fundamental task in medical image processing, aiming to achieve precise alignment by establishing nonlinear correspondences between images. Traditional methods offer good adaptability and interpretability but are limited by computational efficiency. Although deep learning approaches have significantly improved registration speed and accuracy, they often lack flexibility and generalizability across different datasets and tasks. In recent years, foundation models have emerged as a promising direction, leveraging large and diverse datasets to learn universal features and transformation patterns for image registration, thus demonstrating strong cross-task transferability. However, these models still face challenges in generalization and robustness when encountering novel anatomical structures, varying imaging conditions, or unseen modalities. To address these limitations, this paper incorporates Sharpness-Aware Minimization (SAM) into foundation models to enhance their generalization and robustness in medical image registration. By optimizing the flatness of the loss landscape, SAM improves model stability across diverse data distributions and strengthens its ability to handle complex clinical scenarios. Experimental results show that foundation models integrated with SAM achieve significant improvements in cross-dataset registration performance, offering new insights for the advancement of medical image registration technology. Our code is available at https://github.com/Promise13/fm_sam}{https://github.com/Promise13/fm\_sam.

machine learning, pattern recognition, registration, (19 more...)

arXiv.org Artificial Intelligence

2505.06527

Country:

South America > Peru > Lima Department > Lima Province > Lima (0.04)
North America > United States > North Carolina > Watauga County > Boone (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Asia > China > Sichuan Province > Chengdu (0.04)

Genre: Research Report > New Finding (0.34)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (0.95)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.98)

Add feedback

Compact and Efficient Neural Networks for Image Recognition Based on Learned 2D Separable Transform

Vashkevich, Maxim, Krivalcevich, Egor

arXiv.org Artificial IntelligenceMay-13-2025

The paper presents a learned two-dimensional separable transform (LST) that can be considered as a new type of computational layer for constructing neural network (NN) architecture for image recognition tasks. The LST based on the idea of sharing the weights of one fullyconnected (FC) layer to process all rows of an image. After that, a second shared FC layer is used to process all columns of image representation obtained from the first layer. The use of LST layers in a NN architecture significantly reduces the number of model parameters compared to models that use stacked FC layers. We show that a NN-classifier based on a single LST layer followed by an FC layer achieves 98.02\% accuracy on the MNIST dataset, while having only 9.5k parameters. We also implemented a LST-based classifier for handwritten digit recognition on the FPGA platform to demonstrate the efficiency of the suggested approach for designing a compact and high-performance implementation of NN models. Git repository with supplementary materials: https://github.com/Mak-Sim/LST-2d

artificial intelligence, machine learning, pattern recognition, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/DSPA64310.2025.10977914

2505.06578

Country:

Europe > France (0.04)
Europe > Belarus > Minsk Region > Minsk (0.04)
Asia > Mongolia (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Image Matching (0.62)

Add feedback

Nature's Insight: A Novel Framework and Comprehensive Analysis of Agentic Reasoning Through the Lens of Neuroscience

Liu, Zinan, Li, Haoran, Lu, Jingyi, Ma, Gaoyuan, Hong, Xu, Iacca, Giovanni, Kumar, Arvind, Tang, Shaojun, Wang, Lin

arXiv.org Artificial IntelligenceMay-12-2025

Autonomous AI is no longer a hard-to-reach concept, it enables the agents to move beyond executing tasks to independently addressing complex problems, adapting to change while handling the uncertainty of the environment. However, what makes the agents truly autonomous? It is agentic reasoning, that is crucial for foundation models to develop symbolic logic, statistical correlations, or large-scale pattern recognition to process information, draw inferences, and make decisions. However, it remains unclear why and how existing agentic reasoning approaches work, in comparison to biological reasoning, which instead is deeply rooted in neural mechanisms involving hierarchical cognition, multimodal integration, and dynamic interactions. In this work, we propose a novel neuroscience-inspired framework for agentic reasoning. Grounded in three neuroscience-based definitions and supported by mathematical and biological foundations, we propose a unified framework modeling reasoning from perception to action, encompassing four core types, perceptual, dimensional, logical, and interactive, inspired by distinct functional roles observed in the human brain. We apply this framework to systematically classify and analyze existing AI reasoning methods, evaluating their theoretical foundations, computational designs, and practical limitations. We also explore its implications for building more generalizable, cognitively aligned agents in physical and virtual environments. Finally, building on our framework, we outline future directions and propose new neural-inspired reasoning methods, analogous to chain-of-thought prompting. By bridging cognitive neuroscience and AI, this work offers a theoretical foundation and practical roadmap for advancing agentic reasoning in intelligent systems. The associated project can be found at: https://github.com/BioRAILab/Awesome-Neuroscience-Agent-Reasoning .

large language model, pattern recognition, simulation of human behavior, (25 more...)

arXiv.org Artificial Intelligence

2505.05515

Country:

North America > United States (0.45)
Asia > China (0.28)
Europe > United Kingdom > England (0.27)

Genre:

Overview (1.00)
Research Report > Promising Solution (0.45)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (1.00)
(12 more...)

Add feedback

Advanced Clustering Framework for Semiconductor Image Analytics Integrating Deep TDA with Self-Supervised and Transfer Learning Techniques

Giri, Janhavi, Lengyel, Attila, Kent, Don, Kibardin, Edward

arXiv.org Artificial IntelligenceMay-8-2025

Semiconductor manufacturing generates vast amounts of image data, crucial for defect identification and yield optimization, yet often exceeds manual inspection capabilities. Traditional clustering techniques struggle with high-dimensional, unlabeled data, limiting their effectiveness in capturing nuanced patterns. This paper introduces an advanced clustering framework that integrates deep Topological Data Analysis (TDA) with self-supervised and transfer learning techniques, offering a novel approach to unsupervised image clustering. TDA captures intrinsic topological features, while self-supervised learning extracts meaningful representations from unlabeled data, reducing reliance on labeled datasets. Transfer learning enhances the framework's adaptability and scalability, allowing fine-tuning to new datasets without retraining from scratch. Validated on synthetic and open-source semiconductor image datasets, the framework successfully identifies clusters aligned with defect patterns and process variations. This study highlights the transformative potential of combining TDA, self-supervised learning, and transfer learning, providing a scalable solution for proactive process monitoring and quality control in semiconductor manufacturing and other domains with large-scale image datasets.

category, machine learning, pattern recognition, (18 more...)

arXiv.org Artificial Intelligence

2505.03848

Country: Europe (0.28)

Genre:

Research Report > Promising Solution (0.34)
Overview > Innovation (0.34)

Industry:

Semiconductors & Electronics (1.00)
Information Technology > Hardware (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
(2 more...)

Add feedback

DOTA: Deformable Optimized Transformer Architecture for End-to-End Text Recognition with Retrieval-Augmented Generation

Nithisopa, Naphat, Panboonyuen, Teerapong

arXiv.org Artificial IntelligenceMay-8-2025

Text recognition in natural images remains a challenging yet essential task, with broad applications spanning computer vision and natural language processing. This paper introduces a novel end-to-end framework that combines ResNet and Vision Transformer backbones with advanced methodologies, including Deformable Convolutions, Retrieval-Augmented Generation, and Conditional Random Fields (CRF). These innovations collectively enhance feature representation and improve Optical Character Recognition (OCR) performance. Specifically, the framework substitutes standard convolution layers in the third and fourth blocks with Deformable Convolutions, leverages adaptive dropout for regularization, and incorporates CRF for more refined sequence modeling. Extensive experiments conducted on six benchmark datasets IC13, IC15, SVT, IIIT5K, SVTP, and CUTE80 validate the proposed method's efficacy, achieving notable accuracies: 97.32% on IC13, 58.26% on IC15, 88.10% on SVT, 74.13% on IIIT5K, 82.17% on SVTP, and 66.67% on CUTE80, resulting in an average accuracy of 77.77%. These results establish a new state-of-the-art for text recognition, demonstrating the robustness of the approach across diverse and challenging datasets.

large language model, machine learning, pattern recognition, (16 more...)

arXiv.org Artificial Intelligence

2505.04175

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

D-Tracker: Modeling Interest Diffusion in Social Activity Tensor Data Streams

Higashiguchi, Shingo, Matsubara, Yasuko, Kawabata, Koki, Murayama, Taichi, Sakurai, Yasushi

arXiv.org Artificial IntelligenceMay-2-2025

Large quantities of social activity data, such as weekly web search volumes and the number of new infections with infectious diseases, reflect peoples' interests and activities. It is important to discover temporal patterns from such data and to forecast future activities accurately. However, modeling and forecasting social activity data streams is difficult because they are high-dimensional and composed of multiple time-varying dynamics such as trends, seasonality, and interest diffusion. In this paper, we propose D-Tracker, a method for continuously capturing time-varying temporal patterns within social activity tensor data streams and forecasting future activities. Our proposed method has the following properties: (a) Interpretable: it incorporates the partial differential equation into a tensor decomposition framework and captures time-varying temporal patterns such as trends, seasonality, and interest diffusion between locations in an interpretable manner; (b) Automatic: it has no hyperparameters and continuously models tensor data streams fully automatically; (c) Scalable: the computation time of D-Tracker is independent of the time series length. Experiments using web search volume data obtained from GoogleTrends, and COVID-19 infection data obtained from COVID-19 Open Data Repository show that our method can achieve higher forecasting accuracy in less computation time than existing methods while extracting the interest diffusion between locations. Our source code and datasets are available at {https://github.com/Higashiguchi-Shingo/D-Tracker.

d-tracker, machine learning, pattern recognition, (24 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3690624.3709192

2505.00242

Country:

North America > United States (0.28)
Asia > Japan > Honshū (0.15)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Information Management (1.00)
Information Technology > Data Science > Data Mining (1.00)
(5 more...)

Add feedback

ADT and Yale partner on Z-Wave lock with fingerprint recognition

PCWorldApr-30-2025, 13:00:00 GMT

ADT offers Yale Assure locks with its ADT home security systems, and now the security service provider has partnered with Yale and the Z-Wave Alliance to introduce the Yale Assure Lock 2 Touch with Z-Wave. This is the first Z-Wave lock with fingerprint recognition that is certified to use the Z-Wave User Credential Command Class specification that was released in June 2024. The new lock also features the latest generation Z-Wave 800 chipset, which promises longer battery life and improved range on a Z-Wave mesh network. Thanks to its use of the Z-Wave User Credential Command Class spec, ADT subscribers will be able arm and disarm their security system at the same time they lock or unlock the new deadbolt, all by just touching their previously enrolled finger to the new lock. ADT offers the Yale Assure Lock 2 Touch with Z-Wave with its ADT home security systems, which can be self- or professionally installed.

artificial intelligence, machine learning, pattern recognition, (10 more...)

PCWorld

Industry: Information Technology > Smart Houses & Appliances (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Fingerprint Recognition (0.62)

Add feedback

ProFi-Net: Prototype-based Feature Attention with Curriculum Augmentation for WiFi-based Gesture Recognition

Cui, Zhe, Zhang, Shuxian, Lou, Kangzhi, Tran, Le-Nam

arXiv.org Artificial IntelligenceApr-30-2025

This paper presents ProFi-Net, a novel few-shot learning framework for WiFi-based gesture recognition that overcomes the challenges of limited training data and sparse feature representations. ProFi-Net employs a prototype-based metric learning architecture enhanced with a feature-level attention mechanism, which dynamically refines the Euclidean distance by emphasizing the most discriminative feature dimensions. Additionally, our approach introduces a curriculum-inspired data augmentation strategy exclusively on the query set. By progressively incorporating Gaussian noise of increasing magnitude, the model is exposed to a broader range of challenging variations, thereby improving its generalization and robustness to overfitting. Extensive experiments conducted across diverse real-world environments demonstrate that ProFi-Net significantly outperforms conventional prototype networks and other state-of-the-art few-shot learning methods in terms of classification accuracy and training efficiency.

machine learning, pattern recognition, recognition, (17 more...)

arXiv.org Artificial Intelligence

2504.20193

Genre: Research Report (0.64)

Industry: Information Technology (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Vision > Gesture Recognition (0.78)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.78)

Add feedback