AITopics | Pattern Recognition

Pattern Recognition

"... the research area that studies the operation and design of systems that recognize patterns in data." It includes statistical methods like discriminant analysis, feature extraction, error estimation, cluster analysis.
– Pattern Recognition Laboratory at Delft University of Technology

Attention for Image Registration (AiR): an unsupervised Transformer approach

Wang, Zihao, Delingette, Hervé

arXiv.org Artificial Intelligence, May-5-2021

Image registration, an important basis of many signal processing tasks, often runs into problems of stability and efficiency. Non-learning registration approaches rely on optimizing a similarity metric between the fixed and moving images. Yet those approaches are usually costly in both time and space complexity. The problem can be worse when the images are large or the deformations between them are severe. Recently, deep learning methods, or more precisely convolutional neural network (CNN) based image registration methods, have been widely investigated in the research community and show promising effectiveness in overcoming the weaknesses of non-learning based methods. To explore advanced learning approaches to the image registration problem for solving practical issues, we present in this paper a method that introduces the attention mechanism into deformable image registration. The proposed approach learns the deformation field with a Transformer framework (AiR) that does not rely on a CNN but can still be trained efficiently on GPGPU devices. Put more vividly: we treat image registration as a language translation task and introduce a Transformer to tackle it. Our method learns the deformation map in an unsupervised manner and is tested on two benchmark datasets. The source code of AiR will be released on GitLab.
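The non-learning baseline mentioned in the abstract above, optimizing a similarity metric between the fixed and moving images, can be illustrated with a minimal sketch. It assumes a pure integer translation between two tiny 2D images, uses the sum of squared differences (SSD) as the metric, and searches exhaustively; the function names (`shift`, `ssd`, `register_translation`) are hypothetical, and this is not the Transformer-based AiR method.

```python
# Toy non-learning registration: exhaustive search over integer translations
# that minimizes the sum of squared differences (SSD) between two images.
# Illustration only; not the AiR approach described in the abstract.

def shift(image, dy, dx, fill=0):
    """Return a copy of `image` translated by (dy, dx), padding with `fill`."""
    h, w = len(image), len(image[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sy, sx = y - dy, x - dx
            if 0 <= sy < h and 0 <= sx < w:
                out[y][x] = image[sy][sx]
    return out

def ssd(a, b):
    """Sum of squared differences: lower means more similar."""
    return sum((pa - pb) ** 2 for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))

def register_translation(fixed, moving, max_shift=2):
    """Find the integer (dy, dx) that best aligns `moving` to `fixed`."""
    candidates = [(dy, dx)
                  for dy in range(-max_shift, max_shift + 1)
                  for dx in range(-max_shift, max_shift + 1)]
    return min(candidates, key=lambda d: ssd(fixed, shift(moving, *d)))

fixed = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 9, 5, 0],
    [0, 0, 3, 0, 0],
    [0, 0, 0, 0, 0],
]
moving = shift(fixed, -1, -1)   # simulate a known displacement
best = register_translation(fixed, moving)
print(best)  # (1, 1)
```

Even in this toy setting the search evaluates every candidate displacement against every pixel, which hints at why such optimization becomes costly for large images or severe, non-rigid deformations.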

image registration, registration, transformer, (12 more...)

arXiv:2105.02282

Country:

Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Europe > France > Provence-Alpes-Côte d'Azur > Alpes-Maritimes > Nice (0.04)
Africa > Ethiopia > Addis Ababa > Addis Ababa (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.71)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Image Matching (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

PingAn-VCGroup's Solution for ICDAR 2021 Competition on Scientific Table Image Recognition to Latex

He, Yelin, Qi, Xianbiao, Ye, Jiaquan, Gao, Peng, Chen, Yihao, Li, Bingcong, Tang, Xin, Xiao, Rong

arXiv.org Artificial Intelligence, May-4-2021

Converting a table image into LaTeX code is challenging due to the complexity and diversity of table structures and, compared to traditional OCR, the long-sequence problem. The challenge aims to assess the ability of state-of-the-art methods to convert scientific table images into LaTeX code. The competition has two subtasks with different levels of difficulty. Subtask I, Table Structure Reconstruction, is to reconstruct the structure of a table image in the form of LaTeX code while ignoring the content of the table. Subtask II, Table Content Reconstruction, is to reconstruct both the structure and the content of a table image in the form of LaTeX code.
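To make the difference between the two subtasks concrete, here is a minimal sketch that derives a structure-only target from a full LaTeX table by blanking the cell contents. The helper `strip_cell_content` is hypothetical and handles only simple tabular rows (cells separated by `&`, rows ended by `\\`, no escaped `\&`); it is an assumption for illustration and not the competition's actual ground-truth format.

```python
import re

def strip_cell_content(latex_rows):
    """Blank each cell but keep the tokens that encode the table structure."""
    structure = []
    for row in latex_rows:
        body = row.rstrip()
        if not body.endswith(r"\\"):
            structure.append(body)      # e.g. \hline: structural, kept verbatim
            continue
        cells = body[:-2].split("&")    # drop the trailing \\ and split into cells
        tokens = []
        for i, cell in enumerate(cells):
            if i:
                tokens.append("&")      # column separator
            # Keep spanning commands such as \multicolumn{2}{c}{...}, drop their text
            m = re.match(r"\s*(\\multicolumn\{\d+\}\{[^}]*\})", cell)
            if m:
                tokens.append(m.group(1) + "{}")
        tokens.append(r"\\")            # row terminator
        structure.append(" ".join(tokens))
    return structure

# Structure plus content (what Subtask II targets)
full_table = [
    r"\hline",
    r"Model & Accuracy & F1 \\",
    r"\hline",
    r"\multicolumn{2}{c}{Baseline} & 0.81 \\",
]

# Structure only (what Subtask I targets)
for line in strip_cell_content(full_table):
    print(line)
```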

accuracy, competition, prediction accuracy, (12 more...)

arXiv:2105.01846

Genre: Research Report (0.84)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Image Matching (0.42)

Improving Fairness in Speaker Recognition

Fenu, Gianni, Medda, Giacomo, Marras, Mirko, Meloni, Giacomo

arXiv.org Artificial Intelligence, Apr-30-2021

The human voice conveys unique characteristics of an individual, making voice biometrics a key technology for verifying identities in various industries. Despite the impressive progress of speaker recognition systems in terms of accuracy, a number of ethical and legal concerns have been raised, specifically relating to the fairness of such systems. In this paper, we aim to explore the disparity in performance achieved by state-of-the-art deep speaker recognition systems when different groups of individuals characterized by a common sensitive attribute (e.g., gender) are considered. To mitigate the unfairness we uncovered by means of an exploratory study, we investigate whether balancing the representation of the different groups of individuals in the training set can lead to a more equal treatment of these demographic groups. Experiments on two state-of-the-art neural architectures and a large-scale public dataset show that models trained with demographically balanced training sets exhibit fairer behavior on different groups, while still being accurate. Our study is expected to provide a solid basis for instilling beyond-accuracy objectives (e.g., fairness) in speaker recognition.
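The two ideas in the abstract above, measuring a performance disparity across groups and balancing group representation in the training set, can be sketched as follows. This is a minimal illustration under stated assumptions: a plain per-group error rate stands in for whatever metric the paper uses, balancing is done by downsampling to the smallest group, and all names and data (`error_rate_by_group`, `balance_by_group`, the sample records) are hypothetical.

```python
import random
from collections import defaultdict

def error_rate_by_group(records):
    """records: iterable of (group, is_correct). Returns {group: error rate}."""
    totals, errors = defaultdict(int), defaultdict(int)
    for group, is_correct in records:
        totals[group] += 1
        errors[group] += 0 if is_correct else 1
    return {g: errors[g] / totals[g] for g in totals}

def balance_by_group(samples, group_of, seed=0):
    """Downsample every group to the size of the smallest one (an assumption)."""
    buckets = defaultdict(list)
    for s in samples:
        buckets[group_of(s)].append(s)
    n = min(len(b) for b in buckets.values())
    rng = random.Random(seed)
    balanced = []
    for g in sorted(buckets):
        balanced.extend(rng.sample(buckets[g], n))
    return balanced

# Hypothetical verification outcomes: (sensitive attribute value, decision correct?)
results = [("female", True), ("female", False), ("female", False), ("female", True),
           ("male", True), ("male", True), ("male", True), ("male", False)]
rates = error_rate_by_group(results)
gap = max(rates.values()) - min(rates.values())   # disparity between groups
print(rates, gap)

# Hypothetical imbalanced training set: 6 male speakers, 2 female speakers
train = ([("spk%d" % i, "male") for i in range(6)]
         + [("spk%d" % i, "female") for i in range(6, 8)])
balanced = balance_by_group(train, group_of=lambda s: s[1])
print(len(balanced))  # 4
```

Downsampling discards data from the larger group; oversampling or collecting more data for the smaller group are alternative ways to balance that this sketch does not cover.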

artificial intelligence, demographic group, neural network, (18 more...)

arXiv:2104.14067

Country: Europe > Italy (0.15)

Genre: Research Report > New Finding (0.94)

Industry:

Materials > Chemicals > Industrial Gases > Liquified Gas (0.37)
Materials > Chemicals > Commodity Chemicals > Petrochemicals > LNG (0.37)
Energy > Oil & Gas > Midstream (0.37)
Law (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)

End-to-End Approach for Recognition of Historical Digit Strings

Zhao, Mengqiao, Hochuli, Andre G., Cheddad, Abbas

arXiv.org Artificial Intelligence, Apr-28-2021

The plethora of digitalised historical document datasets released in recent years has rekindled interest in advancing the field of handwriting pattern recognition. In the same vein, a recently published dataset, known as ARDIS, presents handwritten digits manually cropped from 15,000 scanned documents of Swedish church books and exhibiting various handwriting styles. To this end, we propose an end-to-end, segmentation-free deep learning approach to handle the challenging ancient handwriting style of the dates present in the ARDIS dataset (4-digit strings). We show that with slight modifications to the VGG-16 deep model, the framework can achieve a recognition rate of 93.2%, resulting in a feasible solution free of heuristic methods, segmentation, and fusion methods. Moreover, the proposed approach outperforms the well-known CRNN method (a model widely applied in handwriting recognition tasks).

machine learning, pattern recognition, recognition, (18 more...)

doi: 10.1007/978-3-030-86334-0_39

arXiv:2104.13666

Country:

South America > Brazil > Paraná > Curitiba (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
(2 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Parallel Scale-wise Attention Network for Effective Scene Text Recognition

Sajid, Usman, Chow, Michael, Zhang, Jin, Kim, Taejoon, Wang, Guanghui

arXiv.org Artificial Intelligence, Apr-25-2021

The paper proposes a new text recognition network for scene-text images. Many state-of-the-art methods employ the attention mechanism either in the text encoder or decoder for text alignment. Although encoder-based attention yields promising results, these schemes inherit noticeable limitations. They perform feature extraction (FE) and visual attention (VA) sequentially, which forces the attention mechanism to rely only on the final single-scale FE output. Moreover, the usefulness of the attention process is limited because it is applied directly only to single-scale feature maps. To address these issues, we propose a new multi-scale, encoder-based attention network for text recognition that performs multi-scale FE and VA in parallel. The multi-scale channels also undergo regular fusion with each other to develop coordinated knowledge. Quantitative evaluation and robustness analysis on the standard benchmarks demonstrate that the proposed network outperforms the state of the art in most cases.

proceedings, recognition, text recognition, (14 more...)

arXiv:2104.12076

Country:

North America > United States > Kansas > Douglas County > Lawrence (0.14)
North America > Canada > Ontario > Toronto (0.04)
Asia > Singapore (0.04)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Text Recognition (0.84)

Stock Forecast Based On a Predictive Algorithm

#artificialintelligence, Apr-21-2021, 03:30:38 GMT

This forecast is part of the Revolut Stock Trading Package, one of I Know First's algorithmic trading tools. The full investment universe includes the most promising stocks available on the Revolut trading platform.

Package Name: Revolut Stock Trading
Recommended Positions: Long
Forecast Length: 3 Months (1/19/21 – 4/19/21)
I Know First Average: 17.09%

This Revolut Stock Trading Package forecast correctly predicted 10 out of 10 stock movements. The highest trade return came from IVZ, at 32.4%.

forecast, predictive algorithm, stock forecast, (5 more...)

#artificialintelligence

Country:

North America > United States > Georgia > Fulton County > Atlanta (0.06)
North America > Bermuda > City of Hamilton > Hamilton (0.06)

Industry: Banking & Finance > Trading (1.00)

Technology:

Information Technology > Information Management (0.40)
Information Technology > Data Science (0.40)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.40)

TeLCoS: OnDevice Text Localization with Clustering of Script

Munjal, Rachit S, Goyal, Manoj, Moharir, Rutika, Moharana, Sukumar

arXiv.org Artificial Intelligence, Apr-21-2021

Recent research in the field of text localization in resource-constrained environments has made extensive use of deep neural networks. Scene text localization and recognition on low-memory mobile devices have a wide range of applications, including content extraction, image categorization and keyword-based image search. To recognize multilingual localized text, OCR systems require prior knowledge of the script of each text instance. This makes word script identification an essential step for text recognition. Most existing methods treat text localization, script identification and text recognition as three separate tasks, which makes script identification an overhead in the recognition pipeline. To reduce this overhead, we propose TeLCoS: OnDevice Text Localization with Clustering of Script, a multi-task, dual-branch, lightweight CNN that performs real-time on-device text localization and high-level script clustering simultaneously. The network drastically reduces the number of calls to a separate script identification module by grouping and identifying some widely used scripts in a single feed-forward pass over the localization network. We also introduce a novel structural-similarity-based channel pruning mechanism to build an efficient network with only 1.15M parameters. Experiments on benchmark datasets suggest that our method achieves state-of-the-art performance, with an execution latency of 60 ms for the entire pipeline on an Exynos 990 chipset device.

accuracy, script identification, text localization, (12 more...)

arXiv:2104.08045

Country:

Asia > India > Karnataka > Bengaluru (0.05)
North America > United States > California > Los Angeles County > Long Beach (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Finding Motifs in Knowledge Graphs using Compression

Bloem, Peter

arXiv.org Machine Learning, Apr-16-2021

We introduce a method to find network motifs in knowledge graphs. Network motifs are useful patterns or meaningful subunits of the graph that recur frequently. We extend the common definition of a network motif to coincide with a basic graph pattern. We introduce an approach, inspired by recent work for simple graphs, to induce these from a given knowledge graph, and show that the motifs found reflect the basic structure of the graph. Specifically, we show that in random graphs, no motifs are found, and that when we insert a motif artificially, it can be detected. Finally, we show the results of motif induction on three real-world knowledge graphs.
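As a rough illustration of what it means for a motif to coincide with a basic graph pattern, the sketch below matches a small pattern with variables against a set of knowledge-graph triples and counts its instances. It is only an assumption-laden toy: the function `match_pattern` and the example triples are hypothetical, and the sketch does not implement the compression-based criterion the abstract uses to decide which patterns count as motifs.

```python
def match_pattern(triples, pattern, binding=None):
    """Yield variable bindings under which every pattern triple occurs in `triples`.

    Terms starting with '?' are variables; everything else is a constant.
    """
    binding = binding or {}
    if not pattern:
        yield dict(binding)
        return
    first, rest = pattern[0], pattern[1:]
    for triple in triples:
        new = dict(binding)
        ok = True
        for p_term, t_term in zip(first, triple):
            if p_term.startswith("?"):
                # Bind the variable, or check consistency with an earlier binding
                if new.setdefault(p_term, t_term) != t_term:
                    ok = False
                    break
            elif p_term != t_term:
                ok = False
                break
        if ok:
            yield from match_pattern(triples, rest, new)

# Hypothetical knowledge graph as (subject, predicate, object) triples
triples = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("carol", "knows", "dave"),
    ("alice", "worksAt", "acme"),
]

# Candidate motif: a two-step "knows" chain
pattern = [("?x", "knows", "?y"), ("?y", "knows", "?z")]

matches = list(match_pattern(triples, pattern))
print(len(matches))  # 2
```

Counting instances is only the first step; a motif in the abstract's sense is a pattern whose occurrences are frequent enough to be informative, which is why the authors check that random graphs yield no motifs.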

graph, knowledge graph, motif, (15 more...)

arXiv:2104.08163

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)
North America > United States > Tennessee (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.68)

Meet Facebook's Powerful New Image Recognition SEER A.I.

#artificialintelligence, Apr-14-2021, 08:36:33 GMT

If Facebook has an unofficial slogan, an equivalent to Google's "Don't Be Evil" or Apple's "Think Different," it is "Move Fast and Break Things." It means, at least in theory, that one should iterate to try new things and not be afraid of the possibility of failure. In 2021, however, with social media currently being blamed for a plethora of societal ills, the phrase should, perhaps, be modified to: "Move Fast and Fix Things." One of the many areas social media, not just Facebook, has been pilloried for is its spreading of certain images online. It's a challenging problem by any stretch of the imagination: Some 4,000 photo uploads are made to Facebook every single second.

facebook, learning, powerful new image recognition seer, (10 more...)

#artificialintelligence

Industry: Information Technology > Services (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Image Matching (0.44)

Image Recognition AI: Algorithms And Applications

#artificialintelligence, Apr-14-2021, 08:10:32 GMT

This breakthrough does not really require someone to feed information to the computer or to be its eyes, so to speak, because this new technique allows machines to interpret and categorize whatever they see in images or videos. In other words, computers now have their own eyes and can therefore work independently, recognizing whatever is around them. Here the model will predict only one label per image. This means that no matter the input or the diversity in the image, the machine will assign only a single label.
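The "one label per image" behavior described above can be sketched in a few lines. This is a minimal illustration, assuming the model emits one raw score per class and that the prediction is the class with the highest softmax probability; the class names, scores and function names are hypothetical and do not come from the article.

```python
import math

def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(scores)                        # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_single_label(scores, labels):
    """Return exactly one label: the class with the highest probability."""
    probs = softmax(scores)
    best = max(range(len(labels)), key=lambda i: probs[i])
    return labels[best], probs[best]

labels = ["cat", "dog", "car"]     # hypothetical classes
scores = [2.0, 1.5, -0.3]          # hypothetical model outputs for one image
label, confidence = predict_single_label(scores, labels)
print(label)  # cat
```

Because only the top class is returned, an image that contains both a cat and a dog would still receive a single label, which is the limitation the passage points out.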

algorithm and application, breakthrough, image recognition ai

#artificialintelligence

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (0.40)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Image Matching (0.40)
