AITopics | Pattern Recognition

Collaborating Authors

Pattern Recognition

"... the research area that studies the operation and design of systems that recognize patterns in data." It includes statistical methods like discriminant analysis, feature extraction, error estimation, cluster analysis.
– Pattern Recognition Laboratory at Delft University of Technology

News Overviews Instructional Materials AI-Alerts Classics

Classification Drives Geographic Bias in Street Scene Segmentation

Nair, Rahul, Tseng, Gabriel, Rolf, Esther, Tokas, Bhanu, Kerner, Hannah

arXiv.org Artificial IntelligenceDec-15-2024

Previous studies showed that image datasets lacking geographic diversity can lead to biased performance in models trained on them. While earlier work studied general-purpose image datasets (e.g., ImageNet) and simple tasks like image recognition, we investigated geo-biases in real-world driving datasets on a more complex task: instance segmentation. We examined if instance segmentation models trained on European driving scenes (Eurocentric models) are geo-biased. Consistent with previous work, we found that Eurocentric models were geo-biased. Interestingly, we found that geo-biases came from classification errors rather than localization errors, with classification errors alone contributing 10-90% of the geo-biases in segmentation and 19-88% of the geo-biases in detection. This showed that while classification is geo-biased, localization (including detection and segmentation) is geographically robust. Our findings show that in region-specific models (e.g., Eurocentric models), geo-biases from classification errors can be significantly mitigated by using coarser classes (e.g., grouping car, bus, and truck as 4-wheeler).

artificial intelligence, machine learning, pattern recognition, (18 more...)

arXiv.org Artificial Intelligence

2412.11061

Country:

Oceania (0.05)
North America > United States > Arizona (0.04)
Europe > Germany (0.04)
(8 more...)

Genre: Research Report > New Finding (1.00)

Industry: Transportation > Ground > Road (0.33)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.35)

Add feedback

GraSP: Simple yet Effective Graph Similarity Predictions

Zheng, Haoran, Shi, Jieming, Yang, Renchi

arXiv.org Artificial IntelligenceDec-13-2024

Graph similarity computation (GSC) is to calculate the similarity between one pair of graphs, which is a fundamental problem with fruitful applications in the graph community. In GSC, graph edit distance (GED) and maximum common subgraph (MCS) are two important similarity metrics, both of which are NP-hard to compute. Instead of calculating the exact values, recent solutions resort to leveraging graph neural networks (GNNs) to learn data-driven models for the estimation of GED and MCS. Most of them are built on components involving node-level interactions crossing graphs, which engender vast computation overhead but are of little avail in effectiveness. In the paper, we present GraSP, a simple yet effective GSC approach for GED and MCS prediction. GraSP achieves high result efficacy through several key instruments: enhanced node features via positional encoding and a GNN model augmented by a gating mechanism, residual connections, as well as multi-scale pooling. Theoretically, GraSP can surpass the 1-WL test, indicating its high expressiveness. Empirically, extensive experiments comparing GraSP against 10 competitors on multiple widely adopted benchmark datasets showcase the superiority of GraSP over prior arts in terms of both effectiveness and efficiency. The code is available at https://github.com/HaoranZ99/GraSP.

artificial intelligence, machine learning, pattern recognition, (20 more...)

arXiv.org Artificial Intelligence

2412.09968

Country: Asia > China > Hong Kong (0.05)

Genre: Research Report (0.64)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.46)
Health & Medicine > Therapeutic Area > Immunology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.46)

Add feedback

On Round-Off Errors and Gaussian Blur in Superresolution and in Image Registration

Savari, Serap A.

arXiv.org Artificial IntelligenceDec-12-2024

Superresolution theory and techniques seek to recover signals from samples in the presence of blur and noise. Discrete image registration can be an approach to fuse information from different sets of samples of the same signal. Quantization errors in the spatial domain are inherent to digital images. We consider superresolution and discrete image registration for one-dimensional spatially-limited piecewise constant functions which are subject to blur which is Gaussian or a mixture of Gaussians as well as to round-off errors. We describe a signal-dependent measurement matrix which captures both types of effects. For this setting we show that the difficulties in determining the discontinuity points from two sets of samples even in the absence of other types of noise. If the samples are also subject to statistical noise, then it is necessary to align and segment the data sequences to make the most effective inferences about the amplitudes and discontinuity points. Under some conditions on the blur, the noise, and the distance between discontinuity points, we prove that we can correctly align and determine the first samples following each discontinuity point in two data sequences with an approach based on dynamic programming.

discontinuity point, machine learning, pattern recognition, (18 more...)

arXiv.org Artificial Intelligence

2412.09741

Country:

North America > United States > Texas > Brazos County > College Station (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > District of Columbia > Washington (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Image Matching (0.81)

Add feedback

In-Application Defense Against Evasive Web Scans through Behavioral Analysis

Ousat, Behzad, Shariatnasab, Mahshad, Schafir, Esteban, Chaharsooghi, Farhad Shirani, Kharraz, Amin

arXiv.org Artificial IntelligenceDec-9-2024

Web traffic has evolved to include both human users and automated agents, ranging from benign web crawlers to adversarial scanners such as those capable of credential stuffing, command injection, and account hijacking at the web scale. The estimated financial costs of these adversarial activities are estimated to exceed tens of billions of dollars in 2023. In this work, we introduce WebGuard, a low-overhead in-application forensics engine, to enable robust identification and monitoring of automated web scanners, and help mitigate the associated security risks. WebGuard focuses on the following design criteria: (i) integration into web applications without any changes to the underlying software components or infrastructure, (ii) minimal communication overhead, (iii) capability for real-time detection, e.g., within hundreds of milliseconds, and (iv) attribution capability to identify new behavioral patterns and detect emerging agent categories. To this end, we have equipped WebGuard with multi-modal behavioral monitoring mechanisms, such as monitoring spatio-temporal data and browser events. We also design supervised and unsupervised learning architectures for real-time detection and offline attribution of human and automated agents, respectively. Information theoretic analysis and empirical evaluations are provided to show that multi-modal data analysis, as opposed to uni-modal analysis which relies solely on mouse movement dynamics, significantly improves time-to-detection and attribution accuracy. Various numerical evaluations using real-world data collected via WebGuard are provided achieving high accuracy in hundreds of milliseconds, with a communication overhead below 10 KB per second.

data mining, machine learning, pattern recognition, (22 more...)

arXiv.org Artificial Intelligence

2412.07005

Country:

North America > United States > Florida > Hillsborough County > University (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(3 more...)

Genre:

Research Report (1.00)
Instructional Material > Course Syllabus & Notes (0.46)

Industry:

Information Technology > Security & Privacy (1.00)
Law Enforcement & Public Safety (0.86)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(3 more...)

Add feedback

Text Change Detection in Multilingual Documents Using Image Comparison

Park, Doyoung, Yarram, Naresh Reddy, Kim, Sunjin, Kim, Minkyu, Cho, Seongho, Lee, Taehee

arXiv.org Artificial IntelligenceDec-5-2024

Document comparison typically relies on optical character recognition (OCR) as its core technology. However, OCR requires the selection of appropriate language models for each document and the performance of multilingual or hybrid models remains limited. To overcome these challenges, we propose text change detection (TCD) using an image comparison model tailored for multilingual documents. Unlike OCR-based approaches, our method employs word-level text image-to-image comparison to detect changes. Our model generates bidirectional change segmentation maps between the source and target documents. To enhance performance without requiring explicit text alignment or scaling preprocessing, we employ correlations among multi-scale attention features. We also construct a benchmark dataset comprising actual printed and scanned word pairs in various languages to evaluate our model. We validate our approach using our benchmark dataset and public benchmarks Distorted Document Images and the LRDE Document Binarization Dataset. We compare our model against state-of-the-art semantic segmentation and change detection models, as well as to conventional OCR-based models.

dataset, detection, recognition, (15 more...)

arXiv.org Artificial Intelligence

2412.04137

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > Nevada > Clark County > Las Vegas (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision > Optical Character Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

Add feedback

A Bidirectional Siamese Recurrent Neural Network for Accurate Gait Recognition Using Body Landmarks

Progga, Proma Hossain, Rahman, Md. Jobayer, Biswas, Swapnil, Ahmed, Md. Shakil, Anwary, Arif Reza, Shatabda, Swakkhar

arXiv.org Artificial IntelligenceDec-4-2024

Gait recognition is a significant biometric technique for person identification, particularly in scenarios where other physiological biometrics are impractical or ineffective. In this paper, we address the challenges associated with gait recognition and present a novel approach to improve its accuracy and reliability. The proposed method leverages advanced techniques, including sequential gait landmarks obtained through the Mediapipe pose estimation model, Procrustes analysis for alignment, and a Siamese biGRU-dualStack Neural Network architecture for capturing temporal dependencies. Extensive experiments were conducted on large-scale cross-view datasets to demonstrate the effectiveness of the approach, achieving high recognition accuracy compared to other models. The model demonstrated accuracies of 95.7%, 94.44%, 87.71%, and 86.6% on CASIA-B, SZU RGB-D, OU-MVLP, and Gait3D datasets respectively. The results highlight the potential applications of the proposed method in various practical domains, indicating its significant contribution to the field of gait recognition.

artificial intelligence, machine learning, pattern recognition, (19 more...)

arXiv.org Artificial Intelligence

doi: 10.1016/j.neucom.2024.128313

2412.03498

Country:

North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
Asia > Bangladesh > Dhaka Division > Dhaka District > Dhaka (0.04)
Europe > United Kingdom (0.04)
(2 more...)

Genre:

Research Report > Promising Solution (1.00)
Research Report > New Finding (0.93)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Research on Cervical Cancer p16/Ki-67 Immunohistochemical Dual-Staining Image Recognition Algorithm Based on YOLO

Wu, Xiao-Jun, Zhao, Cai-Jun, Meng, Chun, Wang, Hang

arXiv.org Artificial IntelligenceDec-2-2024

The p16/Ki-67 dual staining method is a new approach for cervical cancer screening with high sensitivity and specificity. However, there are issues of mis-detection and inaccurate recognition when the YOLOv5s algorithm is directly applied to dual-stained cell images. This paper Proposes a novel cervical cancer dual-stained image recognition (DSIR-YOLO) model based on an YOLOv5. By fusing the Swin-Transformer module, GAM attention mechanism, multi-scale feature fusion, and EIoU loss function, the detection performance is significantly improved, with mAP@0.5 and mAP@0.5:0.95 reaching 92.6% and 70.5%, respectively. Compared with YOLOv5s in five-fold cross-validation, the accuracy, recall, mAP@0.5, and mAP@0.5:0.95 of the improved algorithm are increased by 2.3%, 4.1%, 4.3%, and 8.0%, respectively, with smaller variances and higher stability. Compared with other detection algorithms, DSIR-YOLO in this paper sacrifices some performance requirements to improve the network recognition effect. In addition, the influence of dataset quality on the detection results is studied. By controlling the sealing property of pixels, scale difference, unlabelled cells, and diagonal annotation, the model detection accuracy, recall, mAP@0.5, and mAP@0.5:0.95 are improved by 13.3%, 15.3%, 18.3%, and 30.5%, respectively.

accuracy, information, positive cell, (15 more...)

arXiv.org Artificial Intelligence

2412.01372

Country:

Asia > Singapore (0.04)
Asia > China > Fujian Province > Fuzhou (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Oncology > Cervical Cancer (0.94)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Image Matching (0.60)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

NeuroAI for AI Safety

Mineault, Patrick, Zanichelli, Niccolò, Peng, Joanne Zichen, Arkhipov, Anton, Bingham, Eli, Jara-Ettinger, Julian, Mackevicius, Emily, Marblestone, Adam, Mattar, Marcelo, Payne, Andrew, Sanborn, Sophia, Schroeder, Karen, Tavares, Zenna, Tolias, Andreas

arXiv.org Artificial IntelligenceNov-27-2024

As AI systems become increasingly powerful, the need for safe AI has become more pressing. Humans are an attractive model for AI safety: as the only known agents capable of general intelligence, they perform robustly even under conditions that deviate significantly from prior experiences, explore the world safely, understand pragmatics, and can cooperate to meet their intrinsic goals. Intelligence, when coupled with cooperation and safety mechanisms, can drive sustained progress and well-being. These properties are a function of the architecture of the brain and the learning algorithms it implements. Neuroscience may thus hold important keys to technical AI safety that are currently underexplored and underutilized. In this roadmap, we highlight and critically evaluate several paths toward AI safety inspired by neuroscience: emulating the brain's representations, information processing, and architecture; building robust sensory and motor systems from imitating brain data and bodies; fine-tuning AI systems on brain data; advancing interpretability using neuroscience methods; and scaling up cognitively-inspired architectures. We make several concrete recommendations for how neuroscience can positively impact AI safety.

adversarially robust network, electrophysiological recording, functional specialization, (17 more...)

arXiv.org Artificial Intelligence

2411.18526

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
(15 more...)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.92)
Research Report > Promising Solution (0.67)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(11 more...)

Add feedback

Robust Dynamic Gesture Recognition at Ultra-Long Distances

Beeri, Eran Bamani, Nissinman, Eden, Sintov, Avishai

arXiv.org Artificial IntelligenceNov-27-2024

Dynamic hand gestures play a crucial role in conveying nonverbal information for Human-Robot Interaction (HRI), eliminating the need for complex interfaces. Current models for dynamic gesture recognition suffer from limitations in effective recognition range, restricting their application to close proximity scenarios. In this letter, we present a novel approach to recognizing dynamic gestures in an ultra-range distance of up to 28 meters, enabling natural, directive communication for guiding robots in both indoor and outdoor environments. Our proposed SlowFast-Transformer (SFT) model effectively integrates the SlowFast architecture with Transformer layers to efficiently process and classify gesture sequences captured at ultra-range distances, overcoming challenges of low resolution and environmental noise. We further introduce a distance-weighted loss function shown to enhance learning and improve model robustness at varying distances. Our model demonstrates significant performance improvement over state-of-the-art gesture recognition frameworks, achieving a recognition accuracy of 95.1% on a diverse dataset with challenging ultra-range gestures. This enables robots to react appropriately to human commands from a far distance, providing an essential enhancement in HRI, especially in scenarios requiring seamless and natural interaction.

gesture recognition, interaction, recognition, (14 more...)

arXiv.org Artificial Intelligence

2411.18413

Country: Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)

Genre: Research Report (1.00)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision > Gesture Recognition (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.96)
(2 more...)

Add feedback

PDZSeg: Adapting the Foundation Model for Dissection Zone Segmentation with Visual Prompts in Robot-assisted Endoscopic Submucosal Dissection

Xu, Mengya, Mo, Wenjin, Wang, Guankun, Gao, Huxin, Wang, An, Li, Zhen, Yang, Xiaoxiao, Ren, Hongliang

arXiv.org Artificial IntelligenceNov-27-2024

Endoscopic Submucosal Dissection (ESD) is a surgical procedure employed in the treatment of early-stage gastrointestinal cancers [1, 2]. This procedure entails a complex series of dissection maneuvers that require significant skill to determine the dissection zone. In traditional ESD, a transparent cap is employed to retract lesions, which can often obscure the view of the submucosal layer and lead to an incomplete dissection zone. Conversely, our robot-assisted ESD [3] offers better visualization of the submucosal layer, resulting in a more completed dissection zone by utilizing robotic instruments that enable independent control over retraction and dissection. Achieving successful submucosal dissection requires the careful excision of the lesion or mucosal layer along with the complete submucosal layer while ensuring that both the underlying muscular layer and the mucosal surface remain unharmed. If the electric knife inadvertently contacts tissue outside the designated dissection area, it can lead to damage to the muscle layer, increasing the risk of perforations. Such complications not only elevate the surgical risks but can also complicate the patient's recovery. Therefore, it is imperative to maintain a precise dissection zone during endoscopic procedures. Effective guidance can help ensure that surgeons are adept at identifying and adhering to appropriate dissection boundaries and enhance the overall safety of endoscopic submucosal dissection (ESD).

dissection zone, pdzseg, visual prompt, (11 more...)

arXiv.org Artificial Intelligence

2411.18169

Country:

North America > United States (0.14)
Asia > China > Hong Kong (0.05)
Asia > China > Guangdong Province > Shenzhen (0.05)
(4 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Diagnostic Medicine > Imaging (0.94)
Health & Medicine > Therapeutic Area (0.68)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (0.93)
Information Technology > Sensing and Signal Processing > Image Processing (0.69)
(2 more...)

Add feedback