AITopics | Pattern Recognition

Collaborating Authors

Pattern Recognition

"... the research area that studies the operation and design of systems that recognize patterns in data." It includes statistical methods like discriminant analysis, feature extraction, error estimation, cluster analysis.
– Pattern Recognition Laboratory at Delft University of Technology

News Overviews Instructional Materials AI-Alerts Classics

Efficient Mixed-Type Wafer Defect Pattern Recognition Using Compact Deformable Convolutional Transformers

Shukla, Nitish

arXiv.org Artificial IntelligenceOct-16-2023

Manufacturing wafers is an intricate task involving thousands of steps. Defect Pattern Recognition (DPR) of wafer maps is crucial to find the root cause of the issue and further improving the yield in the wafer foundry. Mixed-type DPR is much more complicated compared to single-type DPR due to varied spatial features, the uncertainty of defects, and the number of defects present. To accurately predict the number of defects as well as the types of defects, we propose a novel compact deformable convolutional transformer (DC Transformer). Specifically, DC Transformer focuses on the global features present in the wafer map by virtue of learnable deformable kernels and multi-head attention to the global features. The proposed method succinctly models the internal relationship between the wafer maps and the defects. DC Transformer is evaluated on a real dataset containing 38 defect patterns. Experimental results show that DC Transformer performs exceptionally well in recognizing both single and mixed-type defects. The proposed method outperforms the current state of the models by a considerable margin

compact deformable convolutional transformer, mixed-type wafer defect pattern recognition

arXiv.org Artificial Intelligence

2303.13827

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.60)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.53)

Add feedback

Extending TrOCR for Text Localization-Free OCR of Full-Page Scanned Receipt Images

Zhang, Hongkuan, Whittaker, Edward, Kitagishi, Ikuo

arXiv.org Artificial IntelligenceOct-16-2023

Digitization of scanned receipts aims to extract text from receipt images and save it into structured documents. This is usually split into two sub-tasks: text localization and optical character recognition (OCR). Most existing OCR models only focus on the cropped text instance images, which require the bounding box information provided by a text region detection model. Introducing an additional detector to identify the text instance images in advance adds complexity, however instance-level OCR models have very low accuracy when processing the whole image for the document-level OCR, such as receipt images containing multiple text lines arranged in various layouts. To this end, we propose a localization-free document-level OCR model for transcribing all the characters in a receipt image into an ordered sequence end-to-end. Specifically, we finetune the pretrained instance-level model TrOCR with randomly cropped image chunks, and gradually increase the image chunk size to generalize the recognition ability from instance images to full-page images. In our experiments on the SROIE receipt OCR dataset, the model finetuned with our strategy achieved 64.4 F1-score and a 22.8% character error rate (CER), respectively, which outperforms the baseline results with 48.5 F1-score and 50.6% CER. The best model, which splits the full image into 15 equally sized chunks, gives 87.8 F1-score and 4.98% CER with minimal additional pre or post-processing of the output. Moreover, the characters in the generated document-level sequences are arranged in the reading order, which is practical for real-world applications.

receipt image, recognition, total, (16 more...)

arXiv.org Artificial Intelligence

2212.05525

Country: Asia (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Vision > Optical Character Recognition (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.51)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

Add feedback

Machine Learning for Urban Air Quality Analytics: A Survey

Han, Jindong, Zhang, Weijia, Liu, Hao, Xiong, Hui

arXiv.org Artificial IntelligenceOct-14-2023

The increasing air pollution poses an urgent global concern with far-reaching consequences, such as premature mortality and reduced crop yield, which significantly impact various aspects of our daily lives. Accurate and timely analysis of air pollution is crucial for understanding its underlying mechanisms and implementing necessary precautions to mitigate potential socio-economic losses. Traditional analytical methodologies, such as atmospheric modeling, heavily rely on domain expertise and often make simplified assumptions that may not be applicable to complex air pollution problems. In contrast, Machine Learning (ML) models are able to capture the intrinsic physical and chemical rules by automatically learning from a large amount of historical observational data, showing great promise in various air quality analytical tasks. In this article, we present a comprehensive survey of ML-based air quality analytics, following a roadmap spanning from data acquisition to pre-processing, and encompassing various analytical tasks such as pollution pattern mining, air quality inference, and forecasting. Moreover, we offer a systematic categorization and summary of existing methodologies and applications, while also providing a list of publicly available air quality datasets to ease the research in this direction. Finally, we identify several promising future research directions. This survey can serve as a valuable resource for professionals seeking suitable solutions for their specific challenges and advancing their research at the cutting edge.

air quality, information, proceedings, (13 more...)

arXiv.org Artificial Intelligence

2310.0962

Country:

North America > United States (0.92)
Asia > China > Hong Kong (0.04)
Asia > China > Guangdong Province > Guangzhou (0.04)
(8 more...)

Genre: Overview (1.00)

Industry:

Energy (0.68)
Transportation > Ground > Road (0.68)
Transportation > Infrastructure & Services (0.67)
(5 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

Add feedback

Insightful analysis of historical sources at scales beyond human capabilities using unsupervised Machine Learning and XAI

Eberle, Oliver, Büttner, Jochen, El-Hajj, Hassan, Montavon, Grégoire, Müller, Klaus-Robert, Valleriani, Matteo

arXiv.org Artificial IntelligenceOct-13-2023

Historical materials are abundant. Yet, piecing together how human knowledge has evolved and spread both diachronically and synchronically remains a challenge that can so far only be very selectively addressed. The vast volume of materials precludes comprehensive studies, given the restricted number of human specialists. However, as large amounts of historical materials are now available in digital form there is a promising opportunity for AI-assisted historical analysis. In this work, we take a pivotal step towards analyzing vast historical corpora by employing innovative machine learning (ML) techniques, enabling in-depth historical insights on a grand scale. Our study centers on the evolution of knowledge within the `Sacrobosco Collection' -- a digitized collection of 359 early modern printed editions of textbooks on astronomy used at European universities between 1472 and 1650 -- roughly 76,000 pages, many of which contain astronomic, computational tables. An ML based analysis of these tables helps to unveil important facets of the spatio-temporal evolution of knowledge and innovation in the field of mathematical astronomy in the period, as taught at European universities.

climate zone, knowledge, representation, (16 more...)

arXiv.org Artificial Intelligence

2310.09091

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > Portugal > Lisbon > Lisbon (0.14)
Europe > Germany > Hesse > Darmstadt Region > Frankfurt (0.04)
(35 more...)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)
Summary/Review (0.92)

Industry:

Education > Educational Setting (0.67)
Health & Medicine > Therapeutic Area > Neurology (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.67)

Add feedback

Trustworthy Machine Learning

Mucsányi, Bálint, Kirchhof, Michael, Nguyen, Elisa, Rubinstein, Alexander, Oh, Seong Joon

arXiv.org Artificial IntelligenceOct-12-2023

As machine learning technology gets applied to actual products and solutions, new challenges have emerged. Models unexpectedly fail to generalize to small changes in the distribution, tend to be confident on novel data they have never seen, or cannot communicate the rationale behind their decisions effectively with the end users. Collectively, we face a trustworthiness issue with the current machine learning technology. This textbook on Trustworthy Machine Learning (TML) covers a theoretical and technical background of four key topics in TML: Out-of-Distribution Generalization, Explainability, Uncertainty Quantification, and Evaluation of Trustworthiness. We discuss important classical and contemporary research papers of the aforementioned fields and uncover and connect their underlying intuitions. The book evolved from the homonymous course at the University of T\"ubingen, first offered in the Winter Semester of 2022/23. It is meant to be a stand-alone product accompanied by code snippets and various pointers to further sources on topics of TML. The dedicated website of the book is https://trustworthyml.io/.

large language model, machine learning, pattern recognition, (30 more...)

arXiv.org Artificial Intelligence

2310.08215

Country:

Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.13)
North America > United States > New Mexico > Lea County (0.13)
North America > United States > Louisiana (0.13)
(2 more...)

Genre:

Summary/Review (1.00)
Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
(3 more...)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area (1.00)
Government (1.00)
(8 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
(12 more...)

Add feedback

Understanding Contrastive Learning Through the Lens of Margins

Rho, Daniel, Kim, TaeSoo, Park, Sooill, Park, Jaehyun, Park, JaeHan

arXiv.org Artificial IntelligenceOct-10-2023

Contrastive learning, along with its variations, has been a highly effective self-supervised learning method across diverse domains. Contrastive learning measures the distance between representations using cosine similarity and uses cross-entropy for representation learning. Within the same framework of cosine-similarity-based representation learning, margins have played a significant role in enhancing face and speaker recognition tasks. Interestingly, despite the shared reliance on the same similarity metrics and objective functions, contrastive learning has not actively adopted margins. Furthermore, decision-boundary-based explanations are the only ones that have been used to explain the effect of margins in contrastive learning. In this work, we propose a new perspective to understand the role of margins based on gradient analysis. Based on the new perspective, we analyze how margins affect gradients of contrastive learning and separate the effect into more elemental levels. We separately analyze each and provide possible directions for improving contrastive learning. Our experimental results demonstrate that emphasizing positive samples and scaling gradients depending on positive sample angles and logits are the keys to improving the generalization performance of contrastive learning in both seen and unseen datasets, and other factors can only marginally improve performance.

contrastive learning, exp, gradient, (10 more...)

arXiv.org Artificial Intelligence

2306.11526

Country: North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)

Add feedback

Learning Expected Appearances for Intraoperative Registration during Neurosurgery

Haouchine, Nazim, Dorent, Reuben, Juvekar, Parikshit, Torio, Erickson, Wells, William M. III, Kapur, Tina, Golby, Alexandra J., Frisken, Sarah

arXiv.org Artificial IntelligenceOct-2-2023

We present a novel method for intraoperative patient-to-image registration by learning Expected Appearances. Our method uses preoperative imaging to synthesize patient-specific expected views through a surgical microscope for a predicted range of transformations. Our method estimates the camera pose by minimizing the dissimilarity between the intraoperative 2D view through the optical microscope and the synthesized expected texture. In contrast to conventional methods, our approach transfers the processing tasks to the preoperative stage, reducing thereby the impact of low-resolution, distorted, and noisy intraoperative images, that often degrade the registration accuracy. We applied our method in the context of neuronavigation during brain surgery. We evaluated our approach on synthetic data and on retrospective data from 6 clinical cases. Our method outperformed state-of-the-art methods and achieved accuracies that met current clinical standards.

brain surface, registration, texture, (16 more...)

arXiv.org Artificial Intelligence

2310.01735

Country:

North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report > Promising Solution (0.68)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Surgery (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.35)

Add feedback

EXTRACTER: Efficient Texture Matching with Attention and Gradient Enhancing for Large Scale Image Super Resolution

Reyes-Saldana, Esteban, Rivera, Mariano

arXiv.org Artificial IntelligenceOct-2-2023

Recent Reference-Based image super-resolution (RefSR) has improved SOTA deep methods introducing attention mechanisms to enhance low-resolution images by transferring high-resolution textures from a reference high-resolution image. The main idea is to search for matches between patches using LR and Reference image pair in a feature space and merge them using deep architectures. However, existing methods lack the accurate search of textures. They divide images into as many patches as possible, resulting in inefficient memory usage, and cannot manage large images. Herein, we propose a deep search with a more efficient memory usage that reduces significantly the number of image patches and finds the $k$ most relevant texture match for each low-resolution patch over the high-resolution reference patches, resulting in an accurate texture match. We enhance the Super Resolution result adding gradient density information using a simple residual architecture showing competitive metrics results: PSNR and SSMI.

image super-resolution, reference image, texture, (11 more...)

arXiv.org Artificial Intelligence

2310.01379

Country:

North America > Mexico > Guanajuato (0.04)
North America > Mexico > Chihuahua > Ciudad Juárez (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.46)

Add feedback

Mining Java Memory Errors using Subjective Interesting Subgroups with Hierarchical Targets

Remil, Youcef, Bendimerad, Anes, Chambard, Mathieu, Mathonat, Romain, Plantevit, Marc, Kaytoue, Mehdi

arXiv.org Artificial IntelligenceOct-1-2023

Software applications, especially Enterprise Resource Planning (ERP) systems, are crucial to the day-to-day operations of many industries. Therefore, it is essential to maintain these systems effectively using tools that can identify, diagnose, and mitigate their incidents. One promising data-driven approach is the Subgroup Discovery (SD) technique, a data mining method that can automatically mine incident datasets and extract discriminant patterns to identify the root causes of issues. However, current SD solutions have limitations in handling complex target concepts with multiple attributes organized hierarchically. To illustrate this scenario, we examine the case of Java out-of-memory incidents among several possible applications. We have a dataset that describes these incidents, including their context and the types of Java objects occupying memory when it reaches saturation, with these types arranged hierarchically. This scenario inspires us to propose a novel Subgroup Discovery approach that can handle complex target concepts with hierarchies. To achieve this, we design a pattern syntax and a quality measure that ensure the identified subgroups are relevant, non-redundant, and resilient to noise. To achieve the desired quality measure, we use the Subjective Interestingness model that incorporates prior knowledge about the data and promotes patterns that are both informative and surprising relative to that knowledge. We apply this framework to investigate out-of-memory errors and demonstrate its usefulness in incident diagnosis. To validate the effectiveness of our approach and the quality of the identified patterns, we present an empirical study. The source code and data used in the evaluation are publicly accessible, ensuring transparency and reproducibility.

enterprise resource planning, machine learning, pattern recognition, (20 more...)

arXiv.org Artificial Intelligence

2310.00781

Country:

Europe > United Kingdom > North Sea > Central North Sea (0.04)
Europe > France > Auvergne-Rhône-Alpes > Lyon > Lyon (0.04)

Genre:

Research Report (0.82)
Overview (0.67)

Technology:

Information Technology > Enterprise Applications > Enterprise Resource Planning (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.68)

Add feedback

Meta-Learning Framework for End-to-End Imposter Identification in Unseen Speaker Recognition

Chaubey, Ashutosh, Sinha, Sparsh, Ghose, Susmita

arXiv.org Artificial IntelligenceSep-30-2023

Speaker identification systems are deployed in diverse environments, often different from the lab conditions on which they are trained and tested. In this paper, first, we show the problem of generalization using fixed thresholds (computed using EER metric) for imposter identification in unseen speaker recognition and then introduce a robust speaker-specific thresholding technique for better performance. Secondly, inspired by the recent use of meta-learning techniques in speaker verification, we propose an end-to-end meta-learning framework for imposter detection which decouples the problem of imposter detection from unseen speaker identification. Thus, unlike most prior works that use some heuristics to detect imposters, the proposed network learns to detect imposters by leveraging the utterances of the enrolled speakers. Furthermore, we show the efficacy of the proposed techniques on VoxCeleb1, VCTK and the FFSVC 2022 datasets, beating the baselines by up to 10%.

identification, speaker identification, utterance, (13 more...)

arXiv.org Artificial Intelligence

2306.00952

Genre: Research Report > New Finding (0.34)

Industry: Information Technology (0.47)

Technology:

Information Technology > Artificial Intelligence > Speech > Acoustic Processing (0.97)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Speech Recognition (0.63)

Add feedback