AITopics | Pattern Recognition

Pattern Recognition

"... the research area that studies the operation and design of systems that recognize patterns in data." It includes statistical methods such as discriminant analysis, feature extraction, error estimation, and cluster analysis.
– Pattern Recognition Laboratory at Delft University of Technology

Attention Where It Matters: Rethinking Visual Document Understanding with Selective Region Concentration

Cao, Haoyu, Bao, Changcun, Liu, Chaohu, Chen, Huang, Yin, Kun, Liu, Hao, Liu, Yinsong, Jiang, Deqiang, Sun, Xing

arXiv.org Artificial Intelligence, Sep-3-2023

We propose a novel end-to-end document understanding model called SeRum (SElective Region Understanding Model) for extracting meaningful information from document images, with applications including document analysis, retrieval, and office automation. Unlike state-of-the-art approaches that rely on multi-stage technical schemes and are computationally expensive, SeRum converts document image understanding and recognition tasks into a local decoding process over the visual tokens of interest, using a content-aware token merge module. This mechanism lets the model pay more attention to the regions of interest generated by the query decoder, which improves the model's effectiveness and speeds up decoding in the generative scheme. We also design several pre-training tasks to enhance the understanding and local awareness of the model. Experimental results demonstrate that SeRum achieves state-of-the-art performance on document understanding tasks and competitive results on text spotting tasks. SeRum represents a substantial advancement towards enabling efficient and effective end-to-end document understanding.
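The abstract does not spell out how the content-aware token merge module works, so the following is only a minimal sketch of the general idea of decoding over a reduced set of visual tokens. It assumes each visual token already has a relevance score (for example, from a query decoder), keeps the highest-scoring tokens, and averages the rest into a single background token. The function name `merge_tokens` and the averaging rule are hypothetical and are not taken from the paper.

```python
def merge_tokens(tokens, scores, keep):
    """Keep the `keep` highest-scoring tokens and average the rest into one token.

    Hypothetical illustration only: the real SeRum module is not described
    in the abstract. `tokens` is a list of equal-length float vectors and
    `scores` holds one relevance score per token.
    """
    if not tokens or len(tokens) != len(scores):
        raise ValueError("tokens and scores must be non-empty and of equal length")
    # Rank token indices by descending relevance score.
    order = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)
    kept_idx = sorted(order[:keep])  # restore the original token order
    rest_idx = order[keep:]
    merged = [tokens[i] for i in kept_idx]
    if rest_idx:
        dim = len(tokens[0])
        # Collapse all low-relevance tokens into a single averaged token.
        background = [
            sum(tokens[i][d] for i in rest_idx) / len(rest_idx) for d in range(dim)
        ]
        merged.append(background)
    return merged


if __name__ == "__main__":
    tokens = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [4.0, 0.0]]
    scores = [0.1, 0.9, 0.8, 0.2]
    print(merge_tokens(tokens, scores, keep=2))
```

With fewer tokens to attend over, each decoding step does less work, which is the kind of saving the abstract attributes to local decoding.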

computational linguistic, information, serum, (15 more...)

arXiv: 2309.01131

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Washington > King County > Seattle (0.14)
Europe > Portugal > Lisbon > Lisbon (0.04)
(15 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.94)
(2 more...)

The HAPPY HEDGEHOG Project

Bendel, Oliver, Graf, Emanuel, Bollier, Kevin

arXiv.org Artificial Intelligence, Aug-30-2023

Semi-autonomous machines, autonomous machines and robots inhabit closed, semi-closed and open environments, from more structured environments like the household to more unstructured environments like cultural landscapes or the wilderness. There they encounter domestic animals, farm animals, working animals, and wild animals. These creatures could be disturbed, displaced, injured, or killed by the machines. Within the context of machine ethics and social robotics, the School of Business FHNW developed several design studies and prototypes for animal-friendly machines, which can be understood as moral and social machines in the spirit of these disciplines. In 2019-20, a team led by the main author developed a prototype robot lawnmower that can recognize hedgehogs, interrupt its work for them and thus protect them. Every year many of these animals die worldwide because of traditional service robots. HAPPY HEDGEHOG (HHH), as the invention is called, could be a solution to this problem. This article begins by providing an introduction to the background. Then it focuses on the machine's navigation (where the machine comes across certain objects that need to be recognized) and its thermal and image recognition (with the help of machine learning). It also presents obvious weaknesses and possible improvements. The results could be relevant for an industry that wants to market its products as animal-friendly machines.

hedgehog, robot, sensor, (16 more...)

arXiv: 2401.03358

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > Germany > Hesse > Darmstadt Region > Wiesbaden (0.04)
Europe > Central Europe (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.40)

Industry:

Information Technology (0.47)
Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.56)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.35)
Information Technology > Artificial Intelligence > Robots > Robots in the Home (0.34)

Interactive Multi Interest Process Pattern Discovery

Vazifehdoostirani, Mozhgan, Genga, Laura, Lu, Xixi, Verhoeven, Rob, van Laarhoven, Hanneke, Dijkman, Remco

arXiv.org Artificial Intelligence, Aug-28-2023

Existing process pattern discovery methods (PPDMs) are typically unsupervised and focus on a single dimension of interest, such as discovering frequent patterns. We present an interactive multi-interest-driven framework for process pattern discovery aimed at identifying patterns that are optimal according to a multi-dimensional analysis goal. The proposed approach is iterative and interactive, thus taking experts' knowledge into account during the discovery process. The paper focuses on a concrete analysis goal, i.e., deriving process patterns that affect the process outcome. We evaluate the approach on real-world event logs in both interactive and fully automated settings. The approach extracted meaningful patterns validated by expert knowledge in the interactive setting. Patterns extracted in the automated settings consistently led to prediction performance comparable to or better than patterns derived considering single-interest dimensions, without requiring user-defined thresholds.
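The abstract does not say how "optimal according to a multi-dimensional analysis goal" is defined. One common way to make that notion concrete without user-defined thresholds is Pareto dominance, sketched below as an assumption rather than as the paper's method. The pattern names and the two interest scores per pattern are made up for illustration.

```python
def dominates(a, b):
    """True if score vector `a` is at least as good as `b` everywhere and better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(patterns):
    """Return the names of patterns that no other pattern dominates.

    `patterns` maps a pattern name to a tuple of interest scores, where
    higher is better in every dimension (an assumption of this sketch).
    """
    front = []
    for name, score in patterns.items():
        dominated = any(
            dominates(other, score)
            for other_name, other in patterns.items()
            if other_name != name
        )
        if not dominated:
            front.append(name)
    return sorted(front)


if __name__ == "__main__":
    # Hypothetical scores: (frequency interest, outcome-correlation interest).
    candidates = {
        "p1": (0.9, 0.2),
        "p2": (0.5, 0.5),
        "p3": (0.4, 0.4),
        "p4": (0.1, 0.9),
    }
    print(pareto_front(candidates))
```

A selection rule of this kind needs no cut-off per dimension, which fits the abstract's remark that the automated setting works without user-defined thresholds.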

evaluation, interest function, process pattern, (17 more...)

doi: 10.1007/978-3-031-41620-0_18

arXiv: 2308.14475

Country:

Europe > Netherlands > North Holland > Amsterdam (0.04)
Europe > Netherlands > North Brabant > Eindhoven (0.04)
Asia (0.04)

Genre: Research Report > New Finding (0.68)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.68)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (1.00)

DynamicISP: Dynamically Controlled Image Signal Processor for Image Recognition

Yoshimura, Masakazu, Otsuka, Junji, Irie, Atsushi, Ohashi, Takeshi

arXiv.org Artificial Intelligence, Aug-27-2023

Image Signal Processors (ISPs) play important roles in image recognition tasks as well as in the perceptual quality of captured images. In most cases, experts spend considerable effort manually tuning the many parameters of ISPs, yet the resulting parameters remain sub-optimal. In the literature, two types of techniques have been actively studied: a machine learning-based parameter tuning technique and a DNN-based ISP technique. The former is lightweight but lacks expressive power. The latter has expressive power, but its computational cost is too heavy for edge devices. To solve these problems, we propose "DynamicISP," which consists of multiple classical ISP functions and dynamically controls the parameters for each frame according to the recognition result of the previous frame. We show that our method successfully controls the parameters of multiple ISP functions and achieves state-of-the-art accuracy with low computational cost in single and multi-category object detection tasks.
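The control loop described above, in which the parameters for the current frame depend on the result from the previous frame, can be sketched as follows. This is an illustrative toy under stated assumptions, not the authors' implementation: the single ISP function is a plain gain, the "recognizer" is a stand-in that reports mean brightness, and the update rule, `target`, and `rate` are all hypothetical.

```python
def apply_gain(frame, gain):
    """Toy classical ISP function: scale pixel values and clip to 255."""
    return [min(255.0, p * gain) for p in frame]


def recognize(frame):
    """Stand-in for a recognizer: returns mean brightness as the feedback signal."""
    return sum(frame) / len(frame)


def next_gain(gain, feedback, target=128.0, rate=0.5):
    """Hypothetical controller: nudge the gain so the feedback approaches `target`."""
    if feedback <= 0:
        return gain
    return gain * (1 + rate * (target / feedback - 1))


def run(frames, gain=1.0):
    """Process frames in order; each frame uses the gain chosen after the previous one."""
    used_gains = []
    for frame in frames:
        used_gains.append(gain)
        processed = apply_gain(frame, gain)
        feedback = recognize(processed)
        gain = next_gain(gain, feedback)  # takes effect on the next frame only
    return used_gains


if __name__ == "__main__":
    dark_frames = [[64.0] * 4 for _ in range(3)]
    print(run(dark_frames))
```

Because the controller only reads the previous frame's result, it adds no extra pass over the current frame, which is consistent with the low computational cost the abstract claims.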

artificial intelligence, machine learning, pattern recognition, (15 more...)

arXiv: 2211.01146

Country: Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

Enhancing Bloodstain Analysis Through AI-Based Segmentation: Leveraging Segment Anything Model for Crime Scene Investigation

Dong, Zihan, Zhang, ZhengDong

arXiv.org Artificial Intelligence, Aug-26-2023

Bloodstain pattern analysis plays a crucial role in crime scene investigations by providing valuable information through the study of unique blood patterns. Conventional image analysis methods, like thresholding and contrast, impose stringent requirements on the image background and are labor-intensive in the context of droplet image segmentation. The Segment Anything Model (SAM), a recently proposed method for extensive image recognition, is yet to be adequately assessed for its accuracy and efficiency on bloodstain image segmentation. This paper explores the application of pre-trained SAM and fine-tuned SAM to bloodstain image segmentation with diverse image backgrounds. Experimental results indicate that both pre-trained and fine-tuned SAM perform the bloodstain image segmentation task with satisfactory accuracy and efficiency, while fine-tuned SAM achieves an overall 2.2% accuracy improvement over pre-trained SAM and a 4.70% speed-up in image recognition. An analysis of the factors that influence bloodstain recognition is also carried out. This research demonstrates the potential application of SAM to bloodstain image segmentation, showcasing the effectiveness of Artificial Intelligence in criminology research. We release all code and demos at https://github.com/Zdong104/Bloodstain_Analysis_Ai_Tool

accuracy, efficiency, segmentation, (16 more...)

arXiv: 2308.13979

Country:

North America > United States > District of Columbia > Washington (0.05)
Oceania > Fiji (0.04)
North America > United States > Oklahoma (0.04)
(3 more...)

Genre: Research Report > Promising Solution (0.68)

Industry: Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.61)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Graph Edit Distance Learning via Different Attention

Lv, Jiaxi, Zhang, Liang, Huang, Yi, Huang, Jiancheng, Chen, Shifeng

arXiv.org Artificial Intelligence, Aug-26-2023

Recently, a growing body of research has used Graph Neural Networks (GNNs) to solve the Graph Similarity Computation (GSC) problem, i.e., computing the Graph Edit Distance (GED) between two graphs. These methods treat GSC as an end-to-end learnable task, and the core of their architecture is the feature fusion module, which lets the features of the two graphs interact. Existing methods hold that graph-level embeddings struggle to capture the differences in small local structures between two graphs, and that fine-grained feature fusion on node-level embeddings therefore improves accuracy, at the cost of greater time and memory consumption in the training and inference phases. In contrast, this paper proposes a novel graph-level fusion module, Different Attention (DiffAtt), and demonstrates that graph-level fusion embeddings can substantially outperform these complex node-level fusion embeddings. We posit that the relative difference structure of the two graphs plays an important role in calculating their GED values. To this end, DiffAtt uses the difference between two graph-level embeddings as an attentional mechanism to capture the structural difference between the two graphs. Based on DiffAtt, a new GSC method, named Graph Edit Distance Learning via Different Attention (REDRAFT), is proposed, and experimental results demonstrate that REDRAFT achieves state-of-the-art performance on 23 out of 25 metrics across five benchmark datasets. On MSE in particular, it outperforms the second-best method by 19.9%, 48.8%, 29.1%, 31.6%, and 2.2%, respectively. Moreover, we propose a quantitative test, the Remaining Subgraph Alignment Test (RESAT), to verify that among all graph-level fusion modules, the fusion embedding generated by DiffAtt best captures the structural differences between two graphs.
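The idea of using the difference between two graph-level embeddings as an attention signal can be illustrated with a small sketch. Everything below is an assumption made for illustration: mean pooling for the graph-level embedding, dot-product scoring, softmax weights, and concatenation for fusion are generic choices, and the abstract does not confirm that DiffAtt is built this way. Node embeddings are given directly as lists of floats instead of being produced by a GNN.

```python
import math


def mean_pool(nodes):
    """Graph-level embedding as the mean of node embeddings (assumed pooling)."""
    dim = len(nodes[0])
    return [sum(n[d] for n in nodes) / len(nodes) for d in range(dim)]


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def diff_attention_pool(nodes, diff):
    """Weight each node by how strongly it aligns with the difference vector."""
    scores = [sum(a * b for a, b in zip(n, diff)) for n in nodes]
    weights = softmax(scores)
    dim = len(nodes[0])
    pooled = [sum(w * n[d] for w, n in zip(weights, nodes)) for d in range(dim)]
    return pooled, weights


def fuse(nodes_a, nodes_b):
    """Fuse two graphs using the difference of their graph-level embeddings."""
    g_a, g_b = mean_pool(nodes_a), mean_pool(nodes_b)
    diff = [x - y for x, y in zip(g_a, g_b)]
    pooled_a, w_a = diff_attention_pool(nodes_a, diff)
    pooled_b, w_b = diff_attention_pool(nodes_b, [-d for d in diff])
    return pooled_a + pooled_b, w_a, w_b  # list concatenation of the two poolings


if __name__ == "__main__":
    graph_a = [[1.0, 0.0], [0.0, 1.0]]
    graph_b = [[0.0, 1.0], [0.0, 1.0]]
    fused, w_a, w_b = fuse(graph_a, graph_b)
    print(fused, w_a, w_b)
```

In this toy, identical graphs give a zero difference vector and therefore uniform attention, while nodes that account for the mismatch between the graphs receive larger weights.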

artificial intelligence, machine learning, pattern recognition, (16 more...)

arXiv: 2308.13871

Country: Asia > China > Guangdong Province > Shenzhen (0.04)

Genre: Research Report > New Finding (0.34)

Industry: Education > Educational Setting > Online (0.60)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.93)

Time-to-Pattern: Information-Theoretic Unsupervised Learning for Scalable Time Series Summarization

Ghods, Alireza, Hoang, Trong Nghia, Cook, Diane

arXiv.org Artificial Intelligence, Aug-25-2023

Data summarization is the process of generating interpretable and representative subsets from a dataset. Existing time series summarization approaches often search for recurring subsequences using a set of manually devised similarity functions to summarize the data. However, such approaches are fraught with limitations stemming from an exhaustive search coupled with a heuristic definition of series similarity. Such approaches affect the diversity and comprehensiveness of the generated data summaries. To mitigate these limitations, we introduce an approach to time series summarization, called Time-to-Pattern (T2P), which aims to find a set of diverse patterns that together encode the most salient information, following the notion of minimum description length. T2P is implemented as a deep generative model that learns informative embeddings of the discrete time series on a latent space specifically designed to be interpretable. Our synthetic and real-world experiments reveal that T2P discovers informative patterns, even in noisy and complex settings. Furthermore, our results also showcase the improved performance of T2P over previous work in pattern diversity and processing scalability, which conclusively demonstrate the algorithm's effectiveness for time series summarization.

artificial intelligence, machine learning, pattern recognition, (19 more...)

arXiv: 2308.13722

Country:

North America > United States > Washington > Whitman County > Pullman (0.04)
Europe > Spain > Galicia > Madrid (0.04)

Genre: Research Report > New Finding (0.48)

Industry:

Government (0.68)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

Nougat: Neural Optical Understanding for Academic Documents

Blecher, Lukas, Cucurull, Guillem, Scialom, Thomas, Stojnic, Robert

arXiv.org Artificial Intelligence, Aug-25-2023

The majority of scientific knowledge is stored in books or published in scientific journals, most commonly in the Portable Document Format (PDF). Next to HTML, PDFs are the second most prominent data format on the internet, making up 2.4% of Common Crawl [1]. However, the information stored in these files is very difficult to extract into any other format. This is especially true for highly specialized documents, such as scientific research papers, where the semantic information of mathematical expressions is lost. Existing Optical Character Recognition (OCR) engines, such as Tesseract OCR [2], excel at detecting and classifying individual characters and words in an image, but fail to understand the relationship between them due to their line-by-line approach. This means that they treat superscripts and subscripts in the same way as the surrounding text, which is a significant drawback for mathematical expressions. In mathematical notation such as fractions, exponents, and matrices, the relative positions of characters are crucial. Converting academic research papers into machine-readable text also enables accessibility and searchability of science as a whole. The information in millions of academic papers cannot be fully accessed because it is locked behind an unreadable format.

machine learning, natural language, pattern recognition, (19 more...)

arXiv: 2308.13418

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
South America > Brazil > Paraná > Curitiba (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
(5 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

UNISOUND System for VoxCeleb Speaker Recognition Challenge 2023

Zheng, Yu, Zhang, Yajun, Niu, Chuanying, Zhan, Yibin, Long, Yanhua, Xu, Dongxing

arXiv.org Artificial Intelligence, Aug-23-2023

This report describes the UNISOUND submission for Track 1 and Track 2 of the VoxCeleb Speaker Recognition Challenge 2023 (VoxSRC 2023). We submit the same system to Track 1 and Track 2, trained only on VoxCeleb2-dev. Large-scale ResNet and RepVGG architectures are developed for the challenge. We propose a consistency-aware score calibration method, which uses a Consistency Measure Factor (CMF) to exploit the stability of audio voiceprints in the similarity score. CMF brings a huge performance boost in this challenge. Our final system is a fusion of six models and achieves first place in Track 1 and second place in Track 2 of VoxSRC 2023. The minDCF of our submission is 0.0855 and the EER is 1.5880%.
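For readers unfamiliar with the EER figure quoted above, the sketch below shows one simple way an equal error rate can be estimated from verification scores: sweep a decision threshold and report the point where the false acceptance rate and false rejection rate are closest. It is a generic illustration, not the challenge's official scoring tool; the function name and the toy scores are made up, and the CMF calibration step is not modeled because the abstract does not describe its formula.

```python
def equal_error_rate(target_scores, impostor_scores):
    """Estimate EER by sweeping thresholds over all observed scores.

    A trial is accepted when its score is >= the threshold. The returned
    value averages FAR and FRR at the threshold where they are closest.
    """
    thresholds = sorted(set(target_scores) | set(impostor_scores))
    best_gap, best_eer = None, None
    for t in thresholds:
        # False acceptance: impostor trials scored at or above the threshold.
        far = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        # False rejection: target trials scored below the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2
    return best_eer


if __name__ == "__main__":
    # Toy similarity scores, for illustration only.
    targets = [0.9, 0.8, 0.7, 0.4]
    impostors = [0.6, 0.3, 0.2, 0.1]
    print(equal_error_rate(targets, impostors))
```

Production evaluations typically interpolate between thresholds on much larger trial lists, so this coarse sweep should be read as a definition aid only.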

machine learning, pattern recognition, recognition, (14 more...)

arXiv: 2308.12526

Country:

Asia > China > Shanghai > Shanghai (0.05)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Speech Recognition (0.64)

UTRNet: High-Resolution Urdu Text Recognition In Printed Documents

Rahman, Abdur, Ghosh, Arjun, Arora, Chetan

arXiv.org Artificial Intelligence, Aug-23-2023

In this paper, we propose a novel approach to address the challenges of printed Urdu text recognition using high-resolution, multi-scale semantic feature extraction. Our proposed UTRNet architecture, a hybrid CNN-RNN model, demonstrates state-of-the-art performance on benchmark datasets. To address the limitations of previous works, which struggle to generalize to the intricacies of the Urdu script, and the lack of sufficient annotated real-world data, we introduce UTRSet-Real, a large-scale annotated real-world dataset comprising over 11,000 lines, and UTRSet-Synth, a synthetic dataset of 20,000 lines closely resembling real-world data. We have also corrected the ground truth of the existing IIITH dataset, making it a more reliable resource for future research. We also provide UrduDoc, a benchmark dataset for Urdu text line detection in scanned documents. Additionally, we have developed an online tool for end-to-end Urdu OCR from printed documents by integrating UTRNet with a text detection model. Our work not only addresses the current limitations of Urdu OCR but also paves the way for future research in this area and facilitates the continued advancement of Urdu OCR technology. The project page with source code, datasets, annotations, trained models, and online tool is available at abdur75648.github.io/UTRNet.

machine learning, pattern recognition, recognition, (17 more...)

doi: 10.1007/978-3-031-41734-4_19

arXiv: 2306.15782

Country:

Asia > India (0.14)
North America > United States > New York > New York County > New York City (0.04)

Genre:

Research Report > New Finding (0.46)
Research Report > Promising Solution (0.34)
Overview > Innovation (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Text Recognition (0.65)
