AITopics | Pattern Recognition

Collaborating Authors

Pattern Recognition

"... the research area that studies the operation and design of systems that recognize patterns in data." It includes statistical methods like discriminant analysis, feature extraction, error estimation, cluster analysis.
– Pattern Recognition Laboratory at Delft University of Technology

News Overviews Instructional Materials AI-Alerts Classics

FAIR-SIGHT: Fairness Assurance in Image Recognition via Simultaneous Conformal Thresholding and Dynamic Output Repair

Fayyazi, Arya, Kamal, Mehdi, Pedram, Massoud

arXiv.org Artificial IntelligenceApr-11-2025

We introduce FAIR-SIGHT, an innovative post-hoc framework designed to ensure fairness in computer vision systems by combining conformal prediction with a dynamic output repair mechanism. Our approach calculates a fairness-aware non-conformity score that simultaneously assesses prediction errors and fairness violations. Using conformal prediction, we establish an adaptive threshold that provides rigorous finite-sample, distribution-free guarantees. When the non-conformity score for a new image exceeds the calibrated threshold, FAIR-SIGHT implements targeted corrective adjustments, such as logit shifts for classification and confidence recalibration for detection, to reduce both group and individual fairness disparities, all without the need for retraining or having access to internal model parameters. Comprehensive theoretical analysis validates our method's error control and convergence properties. At the same time, extensive empirical evaluations on benchmark datasets show that FAIR-SIGHT significantly reduces fairness disparities while preserving high predictive performance.

artificial intelligence, machine learning, pattern recognition, (19 more...)

arXiv.org Artificial Intelligence

2504.07395

Country: North America > United States > California > Los Angeles County > Los Angeles (0.29)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Image Matching (0.40)

Add feedback

Temporal Alignment-Free Video Matching for Few-shot Action Recognition

Lee, SuBeen, Moon, WonJun, Seong, Hyun Seok, Heo, Jae-Pil

arXiv.org Artificial IntelligenceApr-9-2025

Few-Shot Action Recognition (FSAR) aims to train a model with only a few labeled video instances. A key challenge in FSAR is handling divergent narrative trajectories for precise video matching. While the frame- and tuple-level alignment approaches have been promising, their methods heavily rely on pre-defined and length-dependent alignment units (e.g., frames or tuples), which limits flexibility for actions of varying lengths and speeds. In this work, we introduce a novel TEmporal Alignment-free Matching (TEAM) approach, which eliminates the need for temporal units in action representation and brute-force alignment during matching. Specifically, TEAM represents each video with a fixed set of pattern tokens that capture globally discriminative clues within the video instance regardless of action length or speed, ensuring its flexibility. Furthermore, TEAM is inherently efficient, using token-wise comparisons to measure similarity between videos, unlike existing methods that rely on pairwise comparisons for temporal alignment. Additionally, we propose an adaptation process that identifies and removes common information across classes, establishing clear boundaries even between novel categories. Extensive experiments demonstrate the effectiveness of TEAM. Codes are available at github.com/leesb7426/TEAM.

artificial intelligence, machine learning, pattern recognition, (15 more...)

arXiv.org Artificial Intelligence

2504.05956

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.68)

Add feedback

Google AI Mode rolls out to more testers with new image search feature

EngadgetApr-7-2025, 16:00:54 GMT

Google is bringing AI Mode to more people in the US. The company announced on Monday it would make the new search tool, first launched at the start of last month, to millions of more Labs users across the country. For uninitiated, AI Mode is a new dedicated tab within Search. It allows you to ask more complicated questions of Google, with a custom version of Gemini 2.0 doing the legwork to deliver a nuanced AI-generated response. Labs, meanwhile, is a beta program you can enroll your Google account in to gain access to new Search features before the company rolls them out to the public.

ai mode, google, new image search feature, (3 more...)

Engadget

Country: North America > United States (0.28)

Technology:

Information Technology > Information Management > Search (0.64)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.41)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.41)
(2 more...)

Add feedback

IMPACT: A Generic Semantic Loss for Multimodal Medical Image Registration

Boussot, Valentin, Hémon, Cédric, Nunes, Jean-Claude, Downling, Jason, Rouzé, Simon, Lafond, Caroline, Barateau, Anaïs, Dillenseger, Jean-Louis

arXiv.org Artificial IntelligenceApr-3-2025

Image registration is fundamental in medical imaging, enabling precise alignment of anatomical structures for diagnosis, treatment planning, image-guided interventions, and longitudinal monitoring. This work introduces IMPACT (Image Metric with Pretrained model-Agnostic Comparison for Transmodality registration), a novel similarity metric designed for robust multimodal image registration. Rather than relying on raw intensities, handcrafted descriptors, or task-specific training, IMPACT defines a semantic similarity measure based on the comparison of deep features extracted from large-scale pretrained segmentation models. By leveraging representations from models such as TotalSegmentator, Segment Anything (SAM), and other foundation networks, IMPACT provides a task-agnostic, training-free solution that generalizes across imaging modalities. These features, originally trained for segmentation, offer strong spatial correspondence and semantic alignment capabilities, making them naturally suited for registration. The method integrates seamlessly into both algorithmic (Elastix) and learning-based (VoxelMorph) frameworks, leveraging the strengths of each. IMPACT was evaluated on five challenging 3D registration tasks involving thoracic CT/CBCT and pelvic MR/CT datasets. Quantitative metrics, including Target Registration Error and Dice Similarity Coefficient, demonstrated consistent improvements in anatomical alignment over baseline methods. Qualitative analyses further highlighted the robustness of the proposed metric in the presence of noise, artifacts, and modality variations. With its versatility, efficiency, and strong performance across diverse tasks, IMPACT offers a powerful solution for advancing multimodal image registration in both clinical and research settings.

machine learning, pattern recognition, registration, (22 more...)

arXiv.org Artificial Intelligence

2503.24121

Country:

Europe > France > Brittany > Ille-et-Vilaine > Rennes (0.04)
Oceania > Australia > Queensland (0.04)
Europe > United Kingdom (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

Add feedback

Detection of Anomalous Vehicular Traffic and Sensor Failures Using Data Clustering Techniques

Moretti, Davide, Onofri, Elia, Cristiani, Emiliano

arXiv.org Artificial IntelligenceApr-1-2025

The increasing availability of traffic data from sensor networks has created new opportunities for understanding vehicular dynamics and identifying anomalies. In this study, we employ clustering techniques to analyse traffic flow data with the dual objective of uncovering meaningful traffic patterns and detecting anomalies, including sensor failures and irregular congestion events. We explore multiple clustering approaches, i.e. partitioning and hierarchical methods, combined with various time-series representations and similarity measures. Our methodology is applied to real-world data from highway sensors, enabling us to assess the impact of different clustering frameworks on traffic pattern recognition. We also introduce a clustering-driven anomaly detection methodology that identifies deviations from expected traffic behaviour based on distance-based anomaly scores. Results indicate that hierarchical clustering with symbolic representations provides robust segmentation of traffic patterns, while partitioning methods such as k -means and fuzzy c-means yield meaningful results when paired with Dynamic Time Warping. The proposed anomaly detection strategy successfully identifies sensor malfunctions and abnormal traffic conditions with minimal false positives, demonstrating its practical utility for real-time monitoring. Real-world vehicular traffic data are provided by Autostrade Alto Adriatico S.p.A. Keywords.

anomaly, machine learning, pattern recognition, (18 more...)

arXiv.org Artificial Intelligence

2504.00881

Country:

North America > United States > New York > New York County > New York City (0.05)
North America > United States > District of Columbia > Washington (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(7 more...)

Genre: Research Report > New Finding (1.00)

Industry: Transportation > Ground > Road (0.94)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (1.00)

Add feedback

InkFM: A Foundational Model for Full-Page Online Handwritten Note Understanding

Fadeeva, Anastasiia, Coriou, Vincent, Antognini, Diego, Musat, Claudiu, Maksai, Andrii

arXiv.org Artificial IntelligenceMar-29-2025

Tablets and styluses are increasingly popular for taking notes. To optimize this experience and ensure a smooth and efficient workflow, it's important to develop methods for accurately interpreting and understanding the content of handwritten digital notes. We introduce a foundational model called InkFM for analyzing full pages of handwritten content. Trained on a diverse mixture of tasks, this model offers a unique combination of capabilities: recognizing text in 28 different scripts, mathematical expressions recognition, and segmenting pages into distinct elements like text and drawings. Our results demonstrate that these tasks can be effectively unified within a single model, achieving SoTA text line segmentation out-of-the-box quality surpassing public baselines like docTR. Fine- or LoRA-tuning our base model on public datasets further improves the quality of page segmentation, achieves state-of the art text recognition (DeepWriting, CASIA, SCUT, and Mathwriting datasets) and sketch classification (QuickDraw). This adaptability of InkFM provides a powerful starting point for developing applications with handwritten input.

large language model, machine learning, pattern recognition, (19 more...)

arXiv.org Artificial Intelligence

2503.23081

Country: North America > United States (0.04)

Genre: Research Report > New Finding (0.86)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.67)

Add feedback

Masked Self-Supervised Pre-Training for Text Recognition Transformers on Large-Scale Datasets

Kišš, Martin, Hradiš, Michal

arXiv.org Artificial IntelligenceMar-28-2025

Self-supervised learning has emerged as a powerful approach for leveraging large-scale unlabeled data to improve model performance in various domains. In this paper, we explore masked self-supervised pre-training for text recognition transformers. Specifically, we propose two modifications to the pre-training phase: progressively increasing the masking probability, and modifying the loss function to incorporate both masked and non-masked patches. We conduct extensive experiments using a dataset of 50M unlabeled text lines for pre-training and four differently sized annotated datasets for fine-tuning. Furthermore, we compare our pre-trained models against those trained with transfer learning, demonstrating the effectiveness of the self-supervised pre-training. In particular, pre-training consistently improves the character error rate of models, in some cases up to 30 % relatively. It is also on par with transfer learning but without relying on extra annotated text lines.

large language model, machine learning, pattern recognition, (18 more...)

arXiv.org Artificial Intelligence

2503.22513

Country:

Europe > Switzerland (0.05)
Europe > Czechia > South Moravian Region > Brno (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Text Recognition (0.65)

Add feedback

Robust Flower Cluster Matching Using The Unscented Transform

Chu, Andy, Shrestha, Rashik, Gu, Yu, Gross, Jason N.

arXiv.org Artificial IntelligenceMar-26-2025

-- Monitoring flowers over time is essential for precision robotic pollination in agriculture. T o accomplish this, a continuous spatial-temporal observation of plant growth can be done using stationary RGB-D cameras. However, image registration becomes a serious challenge due to changes in the visual appearance of the plant caused by the pollination process and occlusions from growth and camera angles. Plants flower in a manner that produces distinct clusters on branches. This paper presents a method for matching flower clusters using descriptors generated from RGB-D data and considers allowing for spatial uncertainty within the cluster . The proposed approach leverages the Unscented Transform to efficiently estimate plant descriptor uncertainty tolerances, enabling a robust image-registration process despite temporal changes. The Unscented Transform is used to handle the nonlinear transformations by propagating the uncertainty of flower positions to determine the variations in the descriptor domain. A Monte Carlo simulation is used to validate the Unscented Transform results, confirming our method's effectiveness for flower cluster matching. Therefore, it can facilitate improved robotics pollination in dynamic environments. Although global agriculture relies heavily on pollination, evidence has shown that the population of natural pollinators is decreasing, raising concerns about food and the economy [1].

artificial intelligence, machine learning, pattern recognition, (21 more...)

arXiv.org Artificial Intelligence

2503.20631

Country:

North America > United States > West Virginia (0.04)
North America > United States > Rhode Island > Providence County > Providence (0.04)
Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.04)
Asia > Russia (0.04)

Genre: Research Report > New Finding (0.34)

Industry: Food & Agriculture > Agriculture (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Image Matching (0.54)
(2 more...)

Add feedback

Identifying and Characterising Higher Order Interactions in Mobility Networks Using Hypergraphs

Sambaturu, Prathyush, Gutierrez, Bernardo, Kraemer, Moritz U. G.

arXiv.org Artificial IntelligenceMar-24-2025

Human mobility data is crucial for understanding patterns of movement across geographical regions, with applications spanning urban planning[1], transportation systems design[2], infectious disease modeling and control [3, 4], and social dynamics studies [5]. Traditionally, mobility data has been represented using flow networks[6, 7] or colocation matrices [8], where the primary representation is via pairwise interactions. In flow networks, this means directed edges represent the movement of individuals between two locations; colocation matrices measure the probability that a random individual from a region is colocated with a random individual from another region at the same location. These data types and their pairwise representation structure have been used to identify the spatial scales and regularity of human mobility, but have inherent limitations in their capacity to capture more complex patterns of human movement involving higher-order interactions between locations - that is, group of locations that are frequently visited by many individuals within a period of time (e.g., a week) and revisited regularly over time. Higher-order interactions between locations can contain crucial information under certain scenarios.

artificial intelligence, machine learning, pattern recognition, (17 more...)

arXiv.org Artificial Intelligence

2503.18572

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.15)
North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > New York > New York County > New York City (0.04)
(2 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.48)
Transportation > Infrastructure & Services (0.34)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.49)
Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (0.46)

Add feedback

Enhancing Zero-Shot Image Recognition in Vision-Language Models through Human-like Concept Guidance

Liu, Hui, Wang, Wenya, Chen, Kecheng, Liu, Jie, Liu, Yibing, Qin, Tiexin, He, Peisong, Jiang, Xinghao, Li, Haoliang

arXiv.org Artificial IntelligenceMar-20-2025

In zero-shot image recognition tasks, humans demonstrate remarkable flexibility in classifying unseen categories by composing known simpler concepts. However, existing vision-language models (VLMs), despite achieving significant progress through large-scale natural language supervision, often underperform in real-world applications because of sub-optimal prompt engineering and the inability to adapt effectively to target classes. To address these issues, we propose a Concept-guided Human-like Bayesian Reasoning (CHBR) framework. Grounded in Bayes' theorem, CHBR models the concept used in human image recognition as latent variables and formulates this task by summing across potential concepts, weighted by a prior distribution and a likelihood function. To tackle the intractable computation over an infinite concept space, we introduce an importance sampling algorithm that iteratively prompts large language models (LLMs) to generate discriminative concepts, emphasizing inter-class differences. We further propose three heuristic approaches involving Average Likelihood, Confidence Likelihood, and Test Time Augmentation (TTA) Likelihood, which dynamically refine the combination of concepts based on the test image. Extensive evaluations across fifteen datasets demonstrate that CHBR consistently outperforms existing state-of-the-art zero-shot generalization methods.

large language model, machine learning, pattern recognition, (20 more...)

arXiv.org Artificial Intelligence

2503.15886

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Asia > China > Hong Kong (0.04)
Asia > China > Shanghai > Shanghai (0.04)
(4 more...)

Genre: Research Report > New Finding (0.67)

Industry: Transportation (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Image Matching (0.82)

Add feedback