AITopics | Pattern Recognition

Pattern Recognition

"... the research area that studies the operation and design of systems that recognize patterns in data." It includes statistical methods like discriminant analysis, feature extraction, error estimation, cluster analysis.
– Pattern Recognition Laboratory at Delft University of Technology

Learning Generalized Non-Rigid Multimodal Biomedical Image Registration from Generic Point Set Data

Baum, Zachary MC, Ungi, Tamas, Schlenger, Christopher, Hu, Yipeng, Barratt, Dean C

arXiv.org Artificial Intelligence, Jul-22-2022

Free Point Transformer (FPT) has been proposed as a data-driven, non-rigid point set registration approach using deep neural networks. As FPT does not assume constraints based on point vicinity or correspondence, it may be trained simply and in a flexible manner by minimizing an unsupervised loss based on the Chamfer Distance. This makes FPT amenable to real-world medical imaging applications where ground-truth deformations may be infeasible to obtain, or in scenarios where only a varying degree of completeness in the point sets to be aligned is available. To test the limit of the correspondence finding ability of FPT and its dependency on training data sets, this work explores the generalizability of the FPT from well-curated non-medical data sets to medical imaging data sets. First, we train FPT on the ModelNet40 dataset to demonstrate its effectiveness and the superior registration performance of FPT over iterative and learning-based point set registration methods. Second, we demonstrate superior performance in rigid and non-rigid registration and robustness to missing data. Last, we highlight the interesting generalizability of the ModelNet-trained FPT by registering reconstructed freehand ultrasound scans of the spine and generic spine models without additional training, whereby the average difference to the ground truth curvatures is 1.3 degrees, across 13 patients.
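The Chamfer Distance named in the abstract above can be illustrated with a minimal sketch. This is not the authors' training loss: it uses one common symmetric form (mean nearest-neighbour Euclidean distance in both directions), and the function name `chamfer_distance` and the toy point sets are assumptions made for illustration. Other variants use squared distances or different normalisation.

```python
import math


def chamfer_distance(source, target):
    """Symmetric Chamfer distance between two point sets.

    Each set is a non-empty list of coordinate tuples of equal dimension.
    No point-to-point correspondence is needed, and the sets may differ in size.
    """
    def mean_nearest(src, dst):
        # Average distance from each point in src to its closest point in dst.
        return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)

    return mean_nearest(source, target) + mean_nearest(target, source)


# Hypothetical toy data: the second set is the first with one point missing.
full = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
partial = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]

print(chamfer_distance(full, full))
print(chamfer_distance(full, partial))
```

Because the measure only needs nearest neighbours, it stays defined when one set is incomplete, which is the property the abstract relies on for partial point sets.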

machine learning, pattern recognition, registration, (18 more...)

arXiv.org Artificial Intelligence

2207.10994

Country:

North America > Canada (0.04)
Europe > United Kingdom > England > Tyne and Wear > Sunderland (0.04)

Genre:

Instructional Material > Course Syllabus & Notes (0.68)
Research Report (0.64)

Industry:

Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Health & Medicine > Therapeutic Area (0.94)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Image Matching (0.41)

Webly Supervised Concept Expansion for General Purpose Vision Models

Kamath, Amita, Clark, Christopher, Gupta, Tanmay, Kolve, Eric, Hoiem, Derek, Kembhavi, Aniruddha

arXiv.org Artificial Intelligence, Jul-20-2022

General Purpose Vision (GPV) systems are models that are designed to solve a wide array of visual tasks without requiring architectural changes. Today, GPVs primarily learn both skills and concepts from large fully supervised datasets. Scaling GPVs to tens of thousands of concepts by acquiring data to learn each concept for every skill quickly becomes prohibitive. This work presents an effective and inexpensive alternative: learn skills from supervised datasets, learn concepts from web image search, and leverage a key characteristic of GPVs: the ability to transfer visual knowledge across skills. We use a dataset of 1M+ images spanning 10k+ visual concepts to demonstrate webly-supervised concept expansion for two existing GPVs (GPV-1 and VL-T5) on 3 benchmarks: 5 COCO-based datasets (80 primary concepts), a newly curated series of 5 datasets based on the OpenImages and VisualGenome repositories (~500 concepts), and the Web-derived dataset (10k+ concepts). We also propose a new architecture, GPV-2, that supports a variety of tasks, from vision tasks like classification and localization to vision+language tasks like QA and captioning, to more niche ones like human-object interaction detection. GPV-2 benefits hugely from web data and outperforms GPV-1 and VL-T5 across these benchmarks. Our data, code, and web demo are available at https://prior.allenai.org/projects/gpv2.

category, computer vision, gpv-2, (11 more...)

arXiv.org Artificial Intelligence

2202.02317

Country:

North America > United States > Illinois (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Immunology (0.46)
Leisure & Entertainment > Sports (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

ALTO: A Large-Scale Dataset for UAV Visual Place Recognition and Localization

Cisneros, Ivan, Yin, Peng, Zhang, Ji, Choset, Howie, Scherer, Sebastian

arXiv.org Artificial Intelligence, Jul-19-2022

We present the ALTO dataset, a vision-focused dataset for the development and benchmarking of Visual Place Recognition (VPR) and Localization methods for Unmanned Aerial Vehicles. The dataset is composed of two long (approximately 150km and 260km) trajectories flown by a helicopter over Ohio and Pennsylvania, and it includes high precision GPS-INS ground truth location data, high precision accelerometer readings, laser altimeter readings, and RGB downward facing camera imagery. In addition, we provide reference imagery over the flight paths, which makes this dataset suitable for VPR benchmarking and other tasks common in Localization, such as image registration and visual odometry. To the authors' knowledge, this is the largest real-world aerial-vehicle dataset of this kind. Our dataset is available at https://github.com/MetaSLAM/ALTO.

dataset, place recognition, visual place recognition, (12 more...)

arXiv.org Artificial Intelligence

2207.12317

Country:

North America > United States > Ohio (0.26)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.15)

Genre: Research Report (0.50)

Industry:

Aerospace & Defense > Aircraft (0.73)
Government > Regional Government > North America Government > United States Government (0.50)

Technology:

Information Technology > Artificial Intelligence > Robots (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.35)

Bluescape Launches Popsync for Collaborative Image Search Experience

#artificialintelligence, Jul-15-2022, 14:05:45 GMT

Bluescape announced the launch of Popsync, a collaborative image search and curation experience that allows users to create with free and premium images from across the web within their Bluescape workspace. In one search, anyone can quickly view multiple libraries at once, including exclusive agency partnerships with Getty Images, iStock and Unsplash, along with Google Images and more, combining speed and creativity like never before. "We are constantly striving to make our content more accessible to our customers where and when they need it and are thrilled to team up with Bluescape to provide creatives with a seamless and fast way to collaborate using visuals," said Peter Orlowsky, Senior Vice President of Strategic Development for Getty Images. "Popsync is a powerful way to search that brings millions of our premium images directly into a Bluescape customer's hands, helping them to explore ideas faster than ever before." "Images are the universal language, and Popsync represents a new frontier in image search," said Peter Jackson, CEO of Bluescape.

machine learning, pattern recognition, popsync, (7 more...)

#artificialintelligence

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (0.84)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Image Matching (0.84)

Scene Text Recognition with Permuted Autoregressive Sequence Models

Bautista, Darwin, Atienza, Rowel

arXiv.org Artificial Intelligence, Jul-14-2022

Context-aware STR methods typically use internal autoregressive (AR) language models (LM). Inherent limitations of AR models motivated two-stage methods which employ an external LM. The conditional independence of the external LM on the input image may cause it to erroneously rectify correct predictions, leading to significant inefficiencies. Our method, PARSeq, learns an ensemble of internal AR LMs with shared weights using Permutation Language Modeling. It unifies context-free non-AR and context-aware AR inference, and iterative refinement using bidirectional context. Using synthetic training data, PARSeq achieves state-of-the-art (SOTA) results in STR benchmarks (91.9% accuracy) and more challenging datasets. It establishes new SOTA results (96.0% accuracy) when trained on real data. PARSeq is optimal on accuracy vs parameter count, FLOPS, and latency because of its simple, unified structure and parallel token processing. Due to its extensive use of attention, it is robust on arbitrarily-oriented text which is common in real-world images. Code, pretrained weights, and data are available at: https://github.com/baudm/parseq.
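The idea behind Permutation Language Modeling mentioned above is that one model can be trained under many factorisation orders by changing only the attention mask. The following is a simplified sketch and not PARSeq's actual implementation: the helper name `permutation_attention_mask` is hypothetical, and special tokens and the position-query stream used in practice are left out.

```python
def permutation_attention_mask(perm):
    """Build a boolean mask for one factorisation order.

    perm lists the sequence positions in the order they are predicted.
    mask[t][s] is True when position t may attend to position s, i.e. when
    s comes strictly earlier than t in the given order.
    """
    n = len(perm)
    mask = [[False] * n for _ in range(n)]
    for step, target in enumerate(perm):
        for source in perm[:step]:
            mask[target][source] = True
    return mask


# The identity order reproduces the usual left-to-right causal mask.
left_to_right = permutation_attention_mask([0, 1, 2])
# Another order: predict position 2 first, then 0, then 1.
shuffled = permutation_attention_mask([2, 0, 1])

for row in shuffled:
    print(row)
```

Sampling several orders per training batch and sharing the weights across them is what lets a single decoder behave like an ensemble of autoregressive models under this scheme.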

machine learning, pattern recognition, recognition, (17 more...)

arXiv.org Artificial Intelligence

2207.06966

Country: North America > United States > Minnesota (0.28)

Genre: Research Report (1.00)

Industry: Energy > Oil & Gas (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

Computer vision is primed for business value

#artificialintelligence, Jul-13-2022, 12:41:44 GMT

Over the past few years, computer vision applications have become ubiquitous. From phones that recognize the faces of their users, to cars that drive themselves, to satellites that track ship movements, the value of computer vision has never been clearer. But hardware shortages and labor disruptions in the pandemic's wake are challenging companies' ability to make good on the promise of computer vision, even as the pandemic itself has accelerated the potential of its use cases. Following is a look at how companies across a range of industries are deploying computer vision to improve and optimize key business processes, from retail fulfillment to health-care diagnostics. Computer vision is a field of artificial intelligence that is focused on processing images and videos to extract meaningful information.

computer vision, machine learning, pattern recognition, (12 more...)

#artificialintelligence

Country:

North America > United States > Texas (0.05)
North America > United States > Ohio (0.05)
North America > United States > New York (0.05)
(5 more...)

Industry:

Retail (0.97)
Transportation > Freight & Logistics Services (0.50)
Health & Medicine > Therapeutic Area (0.49)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.47)

Interactive Machine Learning: A State of the Art Review

Wondimu, Natnael A., Buche, Cédric, Visser, Ubbo

arXiv.org Artificial Intelligence, Jul-13-2022

Machine learning has proved useful in many software disciplines, including computer vision, speech and audio processing, natural language processing, robotics and some other fields. However, its applicability has been significantly hampered due to its black-box nature and significant resource consumption. Performance is achieved at the expense of enormous computational resources, usually compromising the robustness and trustworthiness of the model. Recent research has identified a lack of interactivity as the prime source of these machine learning problems. Consequently, interactive machine learning (iML) has attracted increased attention from researchers on account of its human-in-the-loop modality and relatively efficient resource utilization. A state-of-the-art review of interactive machine learning therefore plays a vital role in easing the effort toward building human-centred models. In this paper, we provide a comprehensive analysis of the state of the art of iML. We analyze salient research works using a mixed merit-oriented and application/task-oriented taxonomy. We use a bottom-up clustering approach to generate a taxonomy of iML research works. Research works on adversarial black-box attacks and corresponding iML-based defense systems, exploratory machine learning, resource-constrained learning, and iML performance evaluation are analyzed under their corresponding themes in our merit-oriented taxonomy. We have further classified these research works into technical and sectoral categories. Finally, research opportunities that we believe are inspiring for future work in iML are discussed thoroughly.

iml, learning, machine learning, (13 more...)

arXiv.org Artificial Intelligence

2207.06196

Country:

Africa > Ethiopia > Addis Ababa > Addis Ababa (0.04)
Oceania > Australia > South Australia > Adelaide (0.04)
North America > United States > Florida > Miami-Dade County > Coral Gables (0.04)
(3 more...)

Genre:

Overview (0.93)
Research Report > New Finding (0.46)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)
Government (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.68)
(2 more...)

Developing a Component Comment Extractor from Product Reviews on E-Commerce Sites

Anda, Shogo, Kikuchi, Masato, Ozono, Tadachika

arXiv.org Artificial Intelligence, Jul-13-2022

Consumers often read product reviews to inform their buying decision, as some consumers want to know about a specific component of a product. However, because typical sentences in product reviews contain various details, users must identify sentences about the components they want to know about amongst the many reviews. Therefore, we aimed to develop a system that identifies and collects component and aspect information of products in sentences. Our BERT-based classifiers assign labels referring to components and aspects to sentences in reviews and extract sentences with comments on specific components and aspects. We determined proper labels for the words identified through pattern matching from product reviews to create the training data. Because we could not use the words as labels, we carefully created labels covering the meanings of the words. However, the training data was imbalanced on component and aspect pairs. We introduced a data augmentation method using WordNet to reduce the bias. Our evaluation demonstrates that the system can determine labels for road bikes using pattern matching, covering more than 88% of the indicators of components and aspects on e-commerce sites. Moreover, our data augmentation method can improve the F1-measure on insufficient data from 0.66 to 0.76.
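For readers unfamiliar with the F1-measure quoted above, here is a minimal sketch of the standard definition (the harmonic mean of precision and recall). The function name `f1_measure` and the counts in the usage lines are made up for illustration and are not taken from the paper.

```python
def f1_measure(true_positives, false_positives, false_negatives):
    """Return the F1-measure computed from raw classification counts.

    Returns 0.0 when precision and recall are both undefined or zero.
    """
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    if predicted == 0 or actual == 0 or true_positives == 0:
        return 0.0
    precision = true_positives / predicted
    recall = true_positives / actual
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for one component/aspect label.
score = f1_measure(true_positives=8, false_positives=2, false_negatives=2)
print(round(score, 2))
```

On imbalanced label sets such as the one described, F1 is usually preferred over plain accuracy because it is not inflated by the majority class.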

augmentation, data augmentation, training data, (15 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/IIAI-AAI55812.2022.00026

2207.05979

Country: Asia > Japan > Honshū > Tōhoku (0.04)

Genre: Research Report (0.83)

Industry: Information Technology > Services > e-Commerce Services (0.62)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.55)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.46)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.35)

Towards Personalized Healthcare in Cardiac Population: The Development of a Wearable ECG Monitoring System, an ECG Lossy Compression Schema, and a ResNet-Based AF Detector

Yi, Wei-Ying, Liu, Peng-Fei, Lo, Sheung-Lai, Chan, Ya-Fen, Zhou, Yu, Leung, Yee, Woo, Kam-Sang, Lee, Alex Pui-Wai, Chen, Jia-Min, Leung, Kwong-Sak

arXiv.org Artificial Intelligence, Jul-11-2022

Cardiovascular diseases (CVDs) are the number one cause of death worldwide. While there is growing evidence that atrial fibrillation (AF) has strong associations with various CVDs, this heart arrhythmia is usually diagnosed using electrocardiography (ECG), which is a risk-free, non-intrusive, and cost-efficient tool. Continuously and remotely monitoring the subjects' ECG information unlocks the potential of prompt pre-diagnosis and timely pre-treatment of AF before the development of any life-threatening conditions/diseases. Ultimately, the CVD-associated mortality could be reduced. In this manuscript, the design and implementation of a personalized healthcare system embodying a wearable ECG device, a mobile application, and a back-end server are presented. This system continuously monitors the users' ECG information to provide personalized health warnings/feedback. The users are able to communicate with their paired health advisors through this system for remote diagnoses, interventions, etc. The implemented wearable ECG devices have been evaluated and showed excellent intra-consistency (CVRMS=5.5%), acceptable inter-consistency (CVRMS=12.1%), and negligible RR-interval errors (ARE<1.4%). To boost the battery life of the wearable devices, a lossy compression schema utilizing the quasi-periodic feature of ECG signals to achieve compression was proposed. Compared to the recognized schemata, it outperformed the others in terms of compression efficiency and distortion, and achieved at least 2x of CR at a certain PRD or RMSE for ECG signals from the MIT-BIH database. To enable automated AF diagnosis/screening in the proposed system, a ResNet-based AF detector was developed. For the ECG records from the 2017 PhysioNet CinC challenge, this AF detector obtained an average testing F1=85.10% and a best testing F1=87.31%, outperforming the state-of-the-art.
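The abbreviations CR, PRD and RMSE above are not expanded in the abstract. The sketch below assumes their usual meanings in the ECG compression literature: compression ratio, percentage root-mean-square difference, and root-mean-square error. The function names and sample values are hypothetical, and the PRD shown is the basic form without mean removal, which may differ from the variant the authors used.

```python
import math


def compression_ratio(original_bits, compressed_bits):
    """CR: size of the original signal divided by the compressed size."""
    return original_bits / compressed_bits


def prd(original, reconstructed):
    """Percentage root-mean-square difference between two equal-length signals."""
    error_energy = sum((x - y) ** 2 for x, y in zip(original, reconstructed))
    signal_energy = sum(x ** 2 for x in original)
    return 100.0 * math.sqrt(error_energy / signal_energy)


def rmse(original, reconstructed):
    """Root-mean-square error between two equal-length signals."""
    squared = sum((x - y) ** 2 for x, y in zip(original, reconstructed))
    return math.sqrt(squared / len(original))


# Hypothetical samples: a short signal and its lossy reconstruction.
signal = [3.0, 4.0, 0.0, 0.0]
lossy = [3.0, 3.0, 0.0, 0.0]

print(compression_ratio(original_bits=1000, compressed_bits=250))
print(prd(signal, lossy))
print(rmse(signal, lossy))
```

Lossy schemes are typically compared by fixing a distortion level (PRD or RMSE) and reporting the CR achieved at that level, which matches the way the abstract states its "at least 2x" result.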

ecg signal, machine learning, pattern recognition, (22 more...)

arXiv.org Artificial Intelligence

2207.05138

Country:

North America > United States (0.27)
Europe > United Kingdom (0.27)
Asia > China > Hong Kong > Sha Tin (0.04)

Genre:

Research Report > New Finding (0.67)
Research Report > Experimental Study (0.67)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Diagnostic Medicine (1.00)

Technology:

Information Technology > Sensing and Signal Processing (1.00)
Information Technology > Hardware (1.00)
Information Technology > Data Science > Data Quality > Data Transformation (1.00)
(6 more...)

BOSS: Bottom-up Cross-modal Semantic Composition with Hybrid Counterfactual Training for Robust Content-based Image Retrieval

Zhang, Wenqiao, Guo, Jiannan, Li, Mengze, Shi, Haochen, Zhang, Shengyu, Li, Juncheng, Tang, Siliang, Zhuang, Yueting

arXiv.org Artificial Intelligence, Jul-9-2022

Content-Based Image Retrieval (CIR) aims to search for a target image by concurrently comprehending the composition of an example image and a complementary text, which potentially impacts a wide variety of real-world applications, such as internet search and fashion retrieval. In this scenario, the input image serves as an intuitive context and background for the search, while the corresponding language expressly requests new traits on how specific characteristics of the query image should be modified in order to get the intended target image. This task is challenging since it necessitates learning and understanding the composite image-text representation by incorporating cross-granular semantic updates. In this paper, we tackle this task by a novel Bottom-up crOss-modal Semantic compoSition (BOSS) with Hybrid Counterfactual Training framework, which sheds new light on the CIR task by studying it from two previously overlooked perspectives: implicitly bottom-up composition of visiolinguistic representation and explicitly fine-grained correspondence of query-target construction. On the one hand, we leverage the implicit interaction and composition of cross-modal embeddings from the bottom local characteristics to the top global semantics, preserving and transforming the visual representation conditioned on language semantics in several continuous steps for effective target image search. On the other hand, we devise a hybrid counterfactual training strategy that can reduce the model's ambiguity for similar queries.

information retrieval, machine learning, pattern recognition, (17 more...)

arXiv.org Artificial Intelligence

2207.04211

Country:

North America > Canada > Quebec > Montreal (0.04)
Asia > Singapore > Central Region > Singapore (0.04)
Asia > China > Zhejiang Province > Hangzhou (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
(3 more...)
