AITopics | Pattern Recognition

Pattern Recognition

"... the research area that studies the operation and design of systems that recognize patterns in data." It includes statistical methods such as discriminant analysis, feature extraction, error estimation, and cluster analysis.
– Pattern Recognition Laboratory at Delft University of Technology

Multi-language Video Subtitle Dataset for Image-based Text Recognition

Singkhornart, Thanadol, Surinta, Olarik

arXiv.org Artificial Intelligence Nov-6-2024

The Multi-language Video Subtitle Dataset is a comprehensive collection designed to support research in text recognition across multiple languages. This dataset includes 4,224 subtitle images extracted from 24 videos sourced from online platforms. It features a wide variety of characters, including Thai consonants, vowels, tone marks, punctuation marks, numerals, Roman characters, and Arabic numerals. With 157 unique characters, the dataset provides a resource for addressing challenges in text recognition within complex backgrounds. It addresses the growing need for high-quality, multilingual text recognition data, particularly as videos with embedded subtitles become increasingly dominant on platforms like YouTube and Facebook. The variability in text length, font, and placement within these images adds complexity, offering a valuable resource for developing and evaluating deep learning models. The dataset facilitates accurate text transcription from video content while providing a foundation for improving computational efficiency in text recognition systems. As a result, it holds significant potential to drive advancements in research and innovation across various computer science disciplines, including artificial intelligence, deep learning, computer vision, and pattern recognition.

machine learning, pattern recognition, subtitle image, (14 more...)

arXiv:2411.05043

Country: Asia > Thailand (0.05)

Genre: Research Report (0.51)

Industry: Education > Curriculum > Subject-Specific Education (0.35)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Text Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

International Scientific Report on the Safety of Advanced AI (Interim Report)

Bengio, Yoshua, Mindermann, Sören, Privitera, Daniel, Besiroglu, Tamay, Bommasani, Rishi, Casper, Stephen, Choi, Yejin, Goldfarb, Danielle, Heidari, Hoda, Khalatbari, Leila, Longpre, Shayne, Mavroudis, Vasilios, Mazeika, Mantas, Ng, Kwan Yee, Okolo, Chinasa T., Raji, Deborah, Skeadas, Theodora, Tramèr, Florian, Adekanmbi, Bayo, Christiano, Paul, Dalrymple, David, Dietterich, Thomas G., Felten, Edward, Fung, Pascale, Gourinchas, Pierre-Olivier, Jennings, Nick, Krause, Andreas, Liang, Percy, Ludermir, Teresa, Marda, Vidushi, Margetts, Helen, McDermid, John A., Narayanan, Arvind, Nelson, Alondra, Oh, Alice, Ramchurn, Gopal, Russell, Stuart, Schaake, Marietje, Song, Dawn, Soto, Alvaro, Tiedrich, Lee, Varoquaux, Gaël, Yao, Andrew, Zhang, Ya-Qin

arXiv.org Artificial Intelligence Nov-5-2024

I am honoured to be chairing the delivery of the inaugural International Scientific Report on Advanced AI Safety. I am proud to publish this interim report, which is the culmination of huge efforts by many experts over the six months since the work was commissioned at the Bletchley Park AI Safety Summit in November 2023. We know that advanced AI is developing very rapidly, and that there is considerable uncertainty over how these advanced AI systems might affect how we live and work in the future. AI has tremendous potential to change our lives for the better, but it also poses risks of harm. That is why having this thorough analysis of the available scientific literature and expert opinion is essential. The more we know, the better equipped we are to shape our collective destiny.

large language model, machine learning, pattern recognition, (26 more...)

arXiv:2412.05282

Country:

North America > United States (1.00)
Asia > Middle East (0.92)
Europe > United Kingdom > England > Buckinghamshire > Milton Keynes (0.24)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (1.00)
(2 more...)

Industry:

Transportation (1.00)
Media > News (1.00)
Leisure & Entertainment (1.00)
(13 more...)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Networks (1.00)
(13 more...)

Multi-modal deformable image registration using untrained neural networks

Nguyen, Quang Luong Nhat, Cao, Ruiming, Waller, Laura

arXiv.org Artificial Intelligence Nov-4-2024

Image registration techniques usually assume that the images to be registered are of a certain type (e.g. single- vs. multi-modal, 2D vs. 3D, rigid vs. deformable), and a general method that works for data under all conditions is lacking. We propose a registration method that utilizes neural networks for image representation. Our method uses untrained networks with limited representation capacity as an implicit prior to guide the registration. Unlike previous approaches that are specialized for specific data types, our method handles both rigid and non-rigid, as well as single- and multi-modal registration, without requiring changes to the model or objective function. We have performed a comprehensive evaluation study using a variety of datasets and demonstrated promising performance.

machine learning, pattern recognition, registration, (19 more...)

arXiv:2411.02672

Country:

North America > United States > California > Alameda County > Berkeley (0.14)
Europe > Switzerland > Zürich > Zürich (0.05)

Genre: Research Report (0.50)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.49)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Image Matching (0.66)

CTPD: Cross-Modal Temporal Pattern Discovery for Enhanced Multimodal Electronic Health Records Analysis

Wang, Fuying, Wu, Feng, Tang, Yihan, Yu, Lequan

arXiv.org Artificial Intelligence Nov-1-2024

Integrating multimodal Electronic Health Records (EHR) data, such as numerical time series and free-text clinical reports, has great potential in predicting clinical outcomes. However, prior work has primarily focused on capturing temporal interactions within individual samples and fusing multimodal information, overlooking critical temporal patterns across patients. These patterns, such as trends in vital signs like abnormal heart rate or blood pressure, can indicate deteriorating health or an impending critical event. Similarly, clinical notes often contain textual descriptions that reflect these patterns. Identifying corresponding temporal patterns across different modalities is crucial for improving the accuracy of clinical outcome predictions, yet it remains a challenging task. To address this gap, we introduce a Cross-Modal Temporal Pattern Discovery (CTPD) framework, designed to efficiently extract meaningful cross-modal temporal patterns from multimodal EHR data. Our approach introduces shared initial temporal pattern representations which are refined using slot attention to generate temporal semantic embeddings. To ensure rich cross-modal temporal semantics in the learned patterns, we introduce a contrastive-based TPNCE loss for cross-modal alignment, along with two reconstruction losses to retain core information of each modality. Evaluations on two clinically critical tasks, 48-hour in-hospital mortality and 24-hour phenotype classification, using the MIMIC-III database demonstrate the superiority of our method over existing approaches.
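
The abstract does not define the TPNCE loss, so the following is only background: a minimal sketch of a generic InfoNCE-style contrastive loss, the family that "contrastive-based ... NCE" losses usually belong to. It is not the authors' formulation. The function name `info_nce_loss`, the `temperature` default, and the use of plain Python lists as embeddings are all assumptions made for illustration.

```python
import math


def info_nce_loss(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss; positives[i] is the match for anchors[i].

    Hypothetical helper: every other row of `positives` acts as a negative.
    """
    def normalize(vector):
        # Scale to unit length so the dot product is a cosine similarity.
        norm = math.sqrt(sum(x * x for x in vector)) or 1.0
        return [x / norm for x in vector]

    a = [normalize(v) for v in anchors]
    p = [normalize(v) for v in positives]
    total = 0.0
    for i, anchor in enumerate(a):
        logits = [
            sum(x * y for x, y in zip(anchor, candidate)) / temperature
            for candidate in p
        ]
        # Numerically stable log-sum-exp over all candidates.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_denominator - logits[i]
    return total / len(a)
```

In a cross-modal setting, `anchors` could hold pattern embeddings from one modality (for example time series) and `positives` the corresponding embeddings from the other (for example clinical notes); the loss is small when matching pairs are more similar than mismatched ones.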

machine learning, modality, pattern recognition, (17 more...)

arXiv:2411.00696

Country:

Europe > France > Grand Est > Bas-Rhin > Strasbourg (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report > Experimental Study (0.87)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Health Care Technology > Medical Record (1.00)
Health & Medicine > Diagnostic Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

RPS: A Generic Reservoir Patterns Sampler

Diop, Lamine, Plantevit, Marc, Soulet, Arnaud

arXiv.org Artificial Intelligence Oct-31-2024

Efficient learning from streaming data is important for modern data analysis due to the continuous and rapid evolution of data streams. Despite significant advancements in stream pattern mining, challenges persist, particularly in managing complex data streams like sequential and weighted itemsets. While reservoir sampling serves as a fundamental method for randomly selecting fixed-size samples from data streams, its application to such complex patterns remains largely unexplored. In this study, we introduce an approach that harnesses a weighted reservoir to facilitate direct pattern sampling from streaming batch data, thus ensuring scalability and efficiency. We present a generic algorithm capable of addressing temporal biases and handling various pattern types, including sequential, weighted, and unweighted itemsets. Through comprehensive experiments conducted on real-world datasets, we evaluate the effectiveness of our method, showcasing its ability to construct accurate incremental online classifiers for sequential data. Our approach not only enables previously unusable online machine learning models for sequential data to achieve accuracy comparable to offline baselines but also represents significant progress in the development of incremental online sequential itemset classifiers.
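
As background on the weighted reservoir idea mentioned above, here is a minimal sketch of a classical weighted reservoir sampler (the key-based scheme of Efraimidis and Spirakis). It is not the RPS algorithm, which the abstract does not detail; the function name `weighted_reservoir_sample` and the `(item, weight)` stream layout are assumptions made for illustration.

```python
import heapq
import random


def weighted_reservoir_sample(stream, k, rng=None):
    """Keep at most k items from a stream of (item, weight) pairs.

    Each item gets the key u ** (1 / weight) with u uniform in [0, 1);
    the k largest keys survive, so heavier items are more likely to stay.
    """
    rng = rng or random.Random()
    heap = []  # min-heap of (key, arrival index, item)
    for index, (item, weight) in enumerate(stream):
        if weight <= 0:
            continue  # items with non-positive weight can never be sampled
        key = rng.random() ** (1.0 / weight)
        entry = (key, index, item)
        if len(heap) < k:
            heapq.heappush(heap, entry)
        elif key > heap[0][0]:
            # Replace the current smallest key in one pass over the stream.
            heapq.heapreplace(heap, entry)
    return [item for _, _, item in heap]
```

The sampler reads the stream once and stores only k entries, which is what makes reservoir methods attractive for data streams of unknown length.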

data mining, machine learning, pattern recognition, (20 more...)

arXiv:2411.00074

Genre: Research Report (1.00)

Industry:

Education > Educational Setting > Online (0.55)
Energy > Oil & Gas > Upstream (0.41)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.87)

Scalable Sampling for High Utility Patterns

Diop, Lamine, Plantevit, Marc

arXiv.org Artificial Intelligence Oct-30-2024

Discovering valuable insights from data through meaningful associations is a crucial task. However, it becomes challenging when trying to identify representative patterns in quantitative databases, especially with large datasets, as enumeration-based strategies struggle due to the vast search space involved. To tackle this challenge, output space sampling methods have emerged as a promising solution thanks to their ability to discover valuable patterns with reduced computational overhead. However, existing sampling methods often encounter limitations when dealing with large quantitative databases, resulting in scalability-related challenges. In this work, we propose a novel high utility pattern sampling algorithm and its on-disk version, both designed for large quantitative databases and based on two original theorems. Our approach ensures both the interactivity required for user-centered methods and strong statistical guarantees through random sampling. Thanks to our method, users can instantly discover relevant and representative utility patterns, facilitating efficient exploration of the database within seconds. To demonstrate the interest of our approach, we present a compelling use case involving archaeological knowledge graph sub-profile discovery. Experiments on semantic and non-semantic quantitative databases show that our approach outperforms the state-of-the-art methods.

constraint, database, transaction, (13 more...)

arXiv:2410.22964

Country:

Europe > France (0.04)
Asia > Singapore (0.04)

Genre: Research Report > Promising Solution (0.86)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.68)

Local and Global Graph Modeling with Edge-weighted Graph Attention Network for Handwritten Mathematical Expression Recognition

Xie, Yejing, Zanibbi, Richard, Mouchère, Harold

arXiv.org Artificial Intelligence Oct-24-2024

TEX), handwritten mathematical expressions offer greater ease of use for humans but pose a greater challenge for machine recognition due to variations in individual writing styles and writing habits. Handwritten Mathematical Expression Recognition (HMER), which involves converting handwritten math into markup language for easier computer processing and rendering, is a challenging yet promising field with a variety of potential applications. Compared to Optical Character Recognition (OCR), recognizing handwritten manuscripts is more challenging due to the wide variation in handwriting styles. HMER not only faces the common challenges of handwriting recognition but also has to deal with the added complexity of interpreting the 2D structure of mathematical expressions. Depending on the processing objective, HMER can be categorized into Online HMER and Offline HMER. Online HMER processes a sequence of temporal trajectories captured by digital devices like tablets and digital pens. Online data is segmented into individual strokes based on pen-down and pen-up interruptions. Offline expressions, in contrast, are static images captured by a scanner, camera, or smartphone.

machine learning, pattern recognition, recognition, (15 more...)

arXiv:2410.18555

Country:

Asia > South Korea (0.14)
Europe > France > Pays de la Loire > Loire-Atlantique > Nantes (0.04)
Europe > Switzerland > Vaud > Lausanne (0.04)
(2 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.88)
Information Technology > Artificial Intelligence > Vision > Face Recognition (0.63)

Integrating Canonical Neural Units and Multi-Scale Training for Handwritten Text Recognition

Wang, Zi-Rui

arXiv.org Artificial Intelligence Oct-23-2024

The segmentation-free research efforts for addressing handwritten text recognition can be divided into three categories: connectionist temporal classification (CTC), hidden Markov model, and encoder-decoder methods. In this paper, inspired by the above three modeling methods, we propose a new recognition network using a novel three-dimensional (3D) attention module and global-local context information. Based on the feature maps of the last convolutional layer, a series of 3D blocks with different resolutions are split. Then, these 3D blocks are fed into the 3D attention module to generate sequential visual features. Finally, by integrating the visual features and the corresponding global-local context features, a well-designed representation can be obtained. The main canonical neural units, including attention mechanisms, fully-connected layers, recurrent units, and convolutional layers, are efficiently organized into a network and can be jointly trained by the CTC loss and the cross-entropy loss. Experiments on the latest Chinese handwritten text datasets (the SCUT-HCCDoc and the SCUT-EPT) and one English handwritten text dataset (the IAM) show that the proposed method sets a new milestone.
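
As background on the CTC component named above, the sketch below shows standard greedy CTC decoding: take the best label per frame, collapse consecutive repeats, then drop blanks. It illustrates CTC in general, not the proposed network; the function name `ctc_greedy_decode` and the convention that label 0 is the blank are assumptions made for illustration.

```python
def ctc_greedy_decode(frame_labels, blank=0):
    """Collapse per-frame best labels into an output label sequence.

    Consecutive duplicates are merged first, then blank labels are removed,
    so a blank between two identical labels keeps both of them.
    """
    decoded = []
    previous = None
    for label in frame_labels:
        if label != previous and label != blank:
            decoded.append(label)
        previous = label
    return decoded
```

For example, the frame sequence `[0, 1, 1, 0, 1, 2, 2, 0]` decodes to `[1, 1, 2]`: the blank separating the two runs of label 1 is what preserves the repeated character.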

machine learning, pattern recognition, recognition, (18 more...)

arXiv:2410.18374

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
Europe > Spain (0.04)
Asia > China > Hainan Province > Haikou (0.04)
Asia > China > Chongqing Province > Chongqing (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision > Handwriting Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Text Recognition (0.66)

Robust Loop Closure by Textual Cues in Challenging Environments

Jin, Tongxing, Nguyen, Thien-Minh, Xu, Xinhang, Yang, Yizhuo, Yuan, Shenghai, Li, Jianping, Xie, Lihua

arXiv.org Artificial Intelligence Oct-21-2024

Loop closure is an important task in robot navigation. However, existing methods mostly rely on some implicit or heuristic features of the environment, which can still fail to work in common environments such as corridors, tunnels, and warehouses. Indeed, navigating in such featureless, degenerative, and repetitive (FDR) environments would also pose a significant challenge even for humans, but explicit text cues in the surroundings often provide the best assistance. This inspires us to propose a multi-modal loop closure method based on explicit human-readable textual cues in FDR environments. Specifically, our approach first extracts scene text entities based on Optical Character Recognition (OCR), then creates a local map of text cues based on accurate LiDAR odometry, and finally identifies loop closure events by a graph-theoretic scheme. Experimental results demonstrate that this approach has superior performance over existing methods that rely solely on visual and LiDAR sensors. To benefit the community, we release the source code and datasets at https://github.com/TongxingJin/TXTLCD.

artificial intelligence, machine learning, pattern recognition, (17 more...)

arXiv:2410.15869

Country:

Asia > Singapore (0.04)
Europe > Greece (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision > Optical Character Recognition (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.48)

Accelerated Sub-Image Search For Variable-Size Patches Identification Based On Virtual Time Series Transformation And Segmentation

Plessen, Mogens

arXiv.org Artificial Intelligence Oct-20-2024

This paper addresses two tasks: (i) fixed-size objects such as hay bales are to be identified in an aerial image for a given reference image of the object, and (ii) variable-size patches such as areas on fields requiring spot spraying or other handling are to be identified in an image for a given small-scale reference image. Both tasks are related. The second differs in that identified sub-images similar to the reference image are further clustered before patch contours are determined by solving a traveling salesman problem. Both tasks are complex in that the exact number of similar sub-images is not known a priori. The main focus of this paper is an acceleration mechanism for sub-image search that is based on a transformation of an image to multivariate time series along the RGB channels and subsequent segmentation to reduce the 2D search space in the image. Two variations of the acceleration mechanism are compared to exhaustive search on diverse synthetic and real-world images. Quantitatively, the proposed method yields solve time reductions of up to 2 orders of magnitude, while qualitatively delivering comparable visual results. The proposed method is neural network-free and does not use any image pre-processing.
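
To make the exhaustive-search baseline mentioned above concrete, here is a minimal sketch that slides a reference patch over every position of an image and keeps the positions whose sum of squared differences is below a threshold. It is not the paper's accelerated method, and it makes simplifying assumptions: a single-channel image stored as nested lists, a hypothetical function name `exhaustive_subimage_search`, and a hypothetical `max_ssd` threshold.

```python
def exhaustive_subimage_search(image, reference, max_ssd=0.0):
    """Return (row, col) top-left corners where `reference` matches `image`.

    A position matches when the sum of squared pixel differences between
    the reference and the image window is at most `max_ssd`.
    """
    image_rows, image_cols = len(image), len(image[0])
    ref_rows, ref_cols = len(reference), len(reference[0])
    matches = []
    for row in range(image_rows - ref_rows + 1):
        for col in range(image_cols - ref_cols + 1):
            ssd = 0.0
            for d_row in range(ref_rows):
                for d_col in range(ref_cols):
                    diff = image[row + d_row][col + d_col] - reference[d_row][d_col]
                    ssd += diff * diff
            if ssd <= max_ssd:
                matches.append((row, col))
    return matches
```

The cost grows with the product of image size and reference size, which is the 2D search space that an acceleration mechanism such as the one described in the abstract aims to shrink.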

artificial intelligence, machine learning, pattern recognition, (19 more...)

arXiv:2410.15425

Country:

North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)
Europe > Switzerland (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Image Matching (0.64)
(2 more...)
