Goto

Collaborating Authors

 Pattern Recognition


Complex Networks for Pattern-Based Data Classification

arXiv.org Artificial Intelligence

Data classification techniques partition the data or feature space into smaller sub-spaces, each corresponding to a specific class. To classify into subspaces, physical features e.g., distance and distributions are utilized. This approach is challenging for the characterization of complex patterns that are embedded in the dataset. However, complex networks remain a powerful technique for capturing internal relationships and class structures, enabling High-Level Classification. Although several complex network-based classification techniques have been proposed, high-level classification by leveraging pattern formation to classify data has not been utilized. In this work, we present two network-based classification techniques utilizing unique measures derived from the Minimum Spanning Tree and Single Source Shortest Path. These network measures are evaluated from the data patterns represented by the inherent network constructed from each class. We have applied our proposed techniques to several data classification scenarios including synthetic and real-world datasets. Compared to the existing classic high-level and machine-learning classification techniques, we have observed promising numerical results for our proposed approaches. Furthermore, the proposed models demonstrate the following distinguished features in comparison to the previous high-level classification techniques: (1) A single network measure is introduced to characterize the data pattern, eliminating the need to determine weight parameters among network measures. Therefore, the model is largely simplified, while obtaining better classification results. (2) The metrics proposed are sensitive and used for classification with competitive results.


Comparing Native and Non-native English Speakers' Behaviors in Collaborative Writing through Visual Analytics

arXiv.org Artificial Intelligence

Understanding collaborative writing dynamics between native speakers (NS) and non-native speakers (NNS) is critical for enhancing collaboration quality and team inclusivity. In this paper, we partnered with communication researchers to develop visual analytics solutions for comparing NS and NNS behaviors in 162 writing sessions across 27 teams. The primary challenges in analyzing writing behaviors are data complexity and the uncertainties introduced by automated methods. In response, we present \textsc{COALA}, a novel visual analytics tool that improves model interpretability by displaying uncertainties in author clusters, generating behavior summaries using large language models, and visualizing writing-related actions at multiple granularities. We validated the effectiveness of \textsc{COALA} through user studies with domain experts (N=2+2) and researchers with relevant experience (N=8). We present the insights discovered by participants using \textsc{COALA}, suggest features for future AI-assisted collaborative writing tools, and discuss the broader implications for analyzing collaborative processes beyond writing.


Causal AI-based Root Cause Identification: Research to Practice at Scale

arXiv.org Artificial Intelligence

Modern applications are increasingly built as vast, intricate, distributed systems. These systems comprise various software modules, often developed by different teams using different programming languages and deployed across hundreds to thousands of machines, sometimes spanning multiple data centers. Given the ir scale and complexity, these applications are often designed to tolerate failures and performance issues through inbuilt failure recovery techniques (e.g., hardware or software redundancy) or extern al methods (e.g., health check - based restarts). Computer systems experience frequent failures despite every effort: performance degradations and violations of reliability and K ey Performance Indicators (K PI s) are inevitable. These failures, depending on their nature, can lead to catastrophic incidents impacting critical systems and customers. Swift and accurate root cause identification is thus essential to avert significant incidents impacting both service quality and end users. In this complex landscape, observability platforms that provide deep insights into system behavior and help identify performance bottlenecks are not just helpful -- they are essential for maintaining reliability, ensuring optimal performance, and quickly resolving issues in production. The ability to reason a bout these systems in real - time is critical to ensuring the scalability and stability of modern services. To aid in these investigations, observability platforms that collect various telemetry data t o inform about application behavior and its underlying infrastructure are getting popular .


Rewards-based image analysis in microscopy

arXiv.org Artificial Intelligence

Analyzing imaging and hyperspectral data is crucial across scientific fields, including biology, medicine, chemistry, and physics. The primary goal is to transform high-resolution or high-dimensional data into an interpretable format to generate actionable insights, aiding decision-making and advancing knowledge. Currently, this task relies on complex, human-designed workflows comprising iterative steps such as denoising, spatial sampling, keypoint detection, feature generation, clustering, dimensionality reduction, and physics-based deconvolutions. The introduction of machine learning over the past decade has accelerated tasks like image segmentation and object detection via supervised learning, and dimensionality reduction via unsupervised methods. However, both classical and NN-based approaches still require human input, whether for hyperparameter tuning, data labeling, or both. The growing use of automated imaging tools, from atomically resolved imaging to biological applications, demands unsupervised methods that optimize data representation for human decision-making or autonomous experimentation. Here, we discuss advances in reward-based workflows, which adopt expert decision-making principles and demonstrate strong transfer learning across diverse tasks. We represent image analysis as a decision-making process over possible operations and identify desiderata and their mappings to classical decision-making frameworks. Reward-driven workflows enable a shift from supervised, black-box models sensitive to distribution shifts to explainable, unsupervised, and robust optimization in image analysis. They can function as wrappers over classical and DCNN-based methods, making them applicable to both unsupervised and supervised workflows (e.g., classification, regression for structure-property mapping) across imaging and hyperspectral data.


Online hand gesture recognition using Continual Graph Transformers

arXiv.org Artificial Intelligence

Online continuous action recognition has emerged as a critical research area due to its practical implications in real-world applications, such as human-computer interaction, healthcare, and robotics. Among various modalities, skeleton-based approaches have gained significant popularity, demonstrating their effectiveness in capturing 3D temporal data while ensuring robustness to environmental variations. However, most existing works focus on segment-based recognition, making them unsuitable for real-time, continuous recognition scenarios. In this paper, we propose a novel online recognition system designed for real-time skeleton sequence streaming. Our approach leverages a hybrid architecture combining Spatial Graph Convolutional Networks (S-GCN) for spatial feature extraction and a Transformer-based Graph Encoder (TGE) for capturing temporal dependencies across frames. Additionally, we introduce a continual learning mechanism to enhance model adaptability to evolving data distributions, ensuring robust recognition in dynamic environments. We evaluate our method on the SHREC'21 benchmark dataset, demonstrating its superior performance in online hand gesture recognition. Our approach not only achieves state-of-the-art accuracy but also significantly reduces false positive rates, making it a compelling solution for real-time applications. The proposed system can be seamlessly integrated into various domains, including human-robot collaboration and assistive technologies, where natural and intuitive interaction is crucial.


Vision-Enhanced Time Series Forecasting via Latent Diffusion Models

arXiv.org Artificial Intelligence

Diffusion models have recently emerged as powerful frameworks for generating high-quality images. While recent studies have explored their application to time series forecasting, these approaches face significant challenges in cross-modal modeling and transforming visual information effectively to capture temporal patterns. In this paper, we propose LDM4TS, a novel framework that leverages the powerful image reconstruction capabilities of latent diffusion models for vision-enhanced time series forecasting. Instead of introducing external visual data, we are the first to use complementary transformation techniques to convert time series into multi-view visual representations, allowing the model to exploit the rich feature extraction capabilities of the pre-trained vision encoder. Subsequently, these representations are reconstructed using a latent diffusion model with a cross-modal conditioning mechanism as well as a fusion module. Experimental results demonstrate that LDM4TS outperforms various specialized forecasting models for time series forecasting tasks.


A Hybrid Edge Classifier: Combining TinyML-Optimised CNN with RRAM-CMOS ACAM for Energy-Efficient Inference

arXiv.org Artificial Intelligence

In recent years, the development of smart edge computing systems to process information locally is on the rise. Many near-sensor machine learning (ML) approaches have been implemented to introduce accurate and energy efficient template matching operations in resource-constrained edge sensing systems, such as wearables. To introduce novel solutions that can be viable for extreme edge cases, hybrid solutions combining conventional and emerging technologies have started to be proposed. Deep Neural Networks (DNN) optimised for edge application alongside new approaches of computing (both device and architecture -wise) could be a strong candidate in implementing edge ML solutions that aim at competitive accuracy classification while using a fraction of the power of conventional ML solutions. In this work, we are proposing a hybrid software-hardware edge classifier aimed at the extreme edge near-sensor systems. The classifier consists of two parts: (i) an optimised digital tinyML network, working as a front-end feature extractor, and (ii) a back-end RRAM-CMOS analogue content addressable memory (ACAM), working as a final stage template matching system. The combined hybrid system exhibits a competitive trade-off in accuracy versus energy metric with $E_{front-end}$ = $96.23 nJ$ and $E_{back-end}$ = $1.45 nJ$ for each classification operation compared with 78.06$\mu$J for the original teacher model, representing a 792-fold reduction, making it a viable solution for extreme edge applications.


EventSTR: A Benchmark Dataset and Baselines for Event Stream based Scene Text Recognition

arXiv.org Artificial Intelligence

Mainstream Scene Text Recognition (STR) algorithms are developed based on RGB cameras which are sensitive to challenging factors such as low illumination, motion blur, and cluttered backgrounds. In this paper, we propose to recognize the scene text using bio-inspired event cameras by collecting and annotating a large-scale benchmark dataset, termed EventSTR. It contains 9,928 high-definition (1280 * 720) event samples and involves both Chinese and English characters. We also benchmark multiple STR algorithms as the baselines for future works to compare. In addition, we propose a new event-based scene text recognition framework, termed SimC-ESTR. It first extracts the event features using a visual encoder and projects them into tokens using a Q-former module. More importantly, we propose to augment the vision tokens based on a memory mechanism before feeding into the large language models. A similarity-based error correction mechanism is embedded within the large language model to correct potential minor errors fundamentally based on contextual information. Extensive experiments on the newly proposed EventSTR dataset and two simulation STR datasets fully demonstrate the effectiveness of our proposed model. We believe that the dataset and algorithmic model can innovatively propose an event-based STR task and are expected to accelerate the application of event cameras in various industries. The source code and pre-trained models will be released on https://github.com/Event-AHU/EventSTR


Handwritten Text Recognition: A Survey

arXiv.org Artificial Intelligence

Handwritten Text Recognition (HTR) has become an essential field within pattern recognition and machine learning, with applications spanning historical document preservation to modern data entry and accessibility solutions. The complexity of HTR lies in the high variability of handwriting, which makes it challenging to develop robust recognition systems. This survey examines the evolution of HTR models, tracing their progression from early heuristic-based approaches to contemporary state-of-the-art neural models, which leverage deep learning techniques. The scope of the field has also expanded, with models initially capable of recognizing only word-level content progressing to recent end-to-end document-level approaches. Our paper categorizes existing work into two primary levels of recognition: (1) \emph{up to line-level}, encompassing word and line recognition, and (2) \emph{beyond line-level}, addressing paragraph- and document-level challenges. We provide a unified framework that examines research methodologies, recent advances in benchmarking, key datasets in the field, and a discussion of the results reported in the literature. Finally, we identify pressing research challenges and outline promising future directions, aiming to equip researchers and practitioners with a roadmap for advancing the field.


Decoding Complexity: Intelligent Pattern Exploration with CHPDA (Context Aware Hybrid Pattern Detection Algorithm)

arXiv.org Artificial Intelligence

Efficient data management is essential for organizations to ensure that sensitive information such as Personally Identifiable Information (PII), Protected Health Information (PHI) and financial records are systematically identified and protected. Effective classification aids in compliance with regulations such as the General Data Protection Regulation (GDPR) and the Health Insurance Portability and Accountability Act (HIPAA), while mitigating security risks through real-time threat detection[3] Automated tools improve operational efficiency by streamlining access and eliminating redundancies. Customized classification systems fulfill global compliance requirements, while centralized control mechanisms enhance governance through unified policy enforcement.[4] Strategic data classification is crucial to achieve security, compliance, and operational effectiveness in the digital environment of today. Identifying PII and PHI across various data formats presents considerable challenges, particularly with unstructured data sets. Differences in encoding and file formats (e.g., PDFs, Word documents, databases, CSV, and other text files) and data storage systems complicate the consistent extraction of sensitive information [5]. Moreover, international regulations such as GDPR, HIPAA, and the California Consumer Privacy Act (CCPA) impose varied compliance mandates, adding further complexity to detection efforts. Customizing detection mechanisms to align with region-specific regulations while ensuring accuracy across different content types is formidable. The necessity for real-time detection and the reduction of false positives amplifies this challenge, necessitating advanced algorithms and comprehensive data management strategies.