 Pattern Recognition


How To Create An AI (Artificial Intelligence) Model

#artificialintelligence

Lemonade is one of this year's hottest IPOs, and a key reason is the company's heavy investment in AI (Artificial Intelligence). The company has used this technology to develop bots that handle the purchase of policies and the management of claims. So how does a company like this create AI models? As should be no surprise, the process is complex and susceptible to failure.


An Intro to AI Image Recognition and Image Generation

#artificialintelligence

Artificial intelligence is undoubtedly altering the ways we live, work, and even create. It enhances the productivity, quality, and speed of work. Image recognition, which used to be tedious work, is now performed by AI-enabled machines. The image-generating capability of artificial intelligence has opened ways for people to go in directions they have never heard of.


Real-time Interface Control with Motion Gesture Recognition based on Non-contact Capacitive Sensing

arXiv.org Artificial Intelligence

Capacitive sensing is a prominent technology that is cost-effective and low-power, with fast recognition speed compared to existing sensing systems. On account of these advantages, capacitive sensing has been widely studied and commercialized in the domains of touch sensing, localization, existence detection, and contact sensing interface applications such as human-computer interaction. However, because a non-contact proximity sensing scheme is easily affected by disturbance from peripheral objects or surroundings, it requires considerably more sensitive data processing than contact sensing, which limits its further use. In this paper, we propose a real-time interface control framework based on non-contact hand motion gesture recognition. The framework processes the raw signals, detects the electric field disturbance triggered by hand gesture movements near the capacitive sensor using an adaptive threshold, and extracts the significant signal frame, covering the authentic signal intervals with a 98.8% detection rate and a 98.4% frame correction rate. Using a GRU model trained on the extracted signal frames, we classify 10 hand motion gesture types with 98.79% accuracy. The framework transmits the classification result and controls the interface of the foreground process depending on the input. This study suggests the feasibility of intuitive interface technology that accommodates flexible human-machine interaction similar to a Natural User Interface, and raises the possibility of commercialization based on measuring electric field disturbance through non-contact proximity sensing, a state-of-the-art sensing technology.
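The abstract does not spell out its adaptive-threshold rule, so the following is only a minimal sketch of one common approach: a sample counts as "active" when it deviates from a rolling idle baseline by more than `k` standard deviations, and consecutive active samples form a candidate signal frame. The function name `detect_frames` and the parameters `window`, `k`, and `min_len` are hypothetical and are not taken from the paper.

```python
from statistics import mean, pstdev


def detect_frames(signal, window=20, k=3.0, min_len=3):
    """Return (start, end) index pairs of segments that depart from a rolling baseline.

    Assumption: the first `window` samples are idle (no gesture present).
    """
    frames = []
    start = None
    baseline = list(signal[:window])  # rolling buffer of recent idle samples
    for i in range(window, len(signal)):
        mu = mean(baseline)
        # The threshold adapts to the noise level of the recent idle samples.
        threshold = k * pstdev(baseline) + 1e-9
        if abs(signal[i] - mu) > threshold:
            if start is None:
                start = i  # a candidate frame begins here
        else:
            if start is not None:
                if i - start >= min_len:
                    frames.append((start, i))  # end index is exclusive
                start = None
            # Only idle samples update the baseline, so gestures do not skew it.
            baseline.pop(0)
            baseline.append(signal[i])
    if start is not None and len(signal) - start >= min_len:
        frames.append((start, len(signal)))
    return frames
```

In a pipeline like the one described, each extracted frame would then be passed to the trained classifier (a GRU in the paper); that stage is omitted here.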


Challenges of Artificial Intelligence -- From Machine Learning and Computer Vision to Emotional Intelligence

arXiv.org Artificial Intelligence

Artificial intelligence (AI) has become a part of everyday conversation and our lives. It is considered the new electricity that is revolutionizing the world. AI attracts heavy investment in both industry and academia. However, there is also a lot of hype in the current AI debate. AI based on so-called deep learning has achieved impressive results in many problems, but its limits are already visible. AI has been under research since the 1940s, and the industry has seen many ups and downs due to over-expectations and the disappointments that followed. The purpose of this book is to give a realistic picture of AI, its history, its potential, and its limitations. We believe that AI is a helper, not a ruler of humans. We begin by describing what AI is and how it has evolved over the decades. After the fundamentals, we explain the importance of massive data for the current mainstream of artificial intelligence. The most common representations for AI, methods, and machine learning are covered. In addition, the main application areas are introduced. Computer vision has been central to the development of AI. The book provides a general introduction to computer vision and includes an exposure to the results and applications of our own research. Emotions are central to human intelligence, but little use has been made of them in AI. We present the basics of emotional intelligence and our own research on the topic. We discuss super-intelligence that transcends human understanding, explaining why such an achievement seems impossible on the basis of present knowledge, and how AI could be improved. Finally, we summarize the current state of AI and what to do in the future. In the appendix, we look at the development of AI education, especially from the perspective of the curriculum at our own university.


Top Face and Image Recognition Apps to Follow in December 2021

#artificialintelligence

With the development of technology, image recognition has become an integral part of our lives. There are now diverse kinds of products and applications on the market intended to analyze and recognize specific objects in images. Biometrics is now a critical feature used by firms and even individuals for their security. The technology is now widely applied and helps control false arrests, diagnose genetic disorders, and reduce malware attacks, cybercrimes, etc. Each application varies in its performance, working methods, use cases, etc. Users can choose the product based on their requirements.


Machine learning LEGO image recognition: Using virtual data and YOLOv3

#artificialintelligence

I have been working a lot with LEGO and 3D models lately. For my current project I am looking to build a LEGO image recognition program. My ideal scenario is to grab a handful of LEGO, toss them on the table, take a picture, and have the program catalog the pieces. The biggest challenge I encounter with any machine learning project is collecting and formatting the training data. I am pretty sure this is the biggest challenge everyone encounters with machine learning.


TransMorph: Transformer for unsupervised medical image registration

arXiv.org Artificial Intelligence

In the last decade, convolutional neural networks (ConvNets) have been a major focus of research in medical image analysis. However, the performance of ConvNets may be limited by a lack of explicit consideration of the long-range spatial relationships in an image. Recently, Vision Transformer architectures have been proposed to address the shortcomings of ConvNets and have produced state-of-the-art performance in many medical imaging applications. Transformers may be a strong candidate for image registration because their unlimited receptive field enables a more precise comprehension of the spatial correspondence between moving and fixed images. Here, we present TransMorph, a hybrid Transformer-ConvNet model for volumetric medical image registration. This paper also presents diffeomorphic and Bayesian variants of TransMorph: the diffeomorphic variants ensure topology-preserving deformations, and the Bayesian variant produces a well-calibrated registration uncertainty estimate. We extensively validated the proposed models using 3D medical images from three applications: inter-patient and atlas-to-patient brain MRI registration and phantom-to-CT registration. The proposed models are evaluated in comparison to a variety of existing registration methods and Transformer architectures. Qualitative and quantitative results demonstrate that the proposed Transformer-based model leads to a substantial performance improvement over the baseline methods, confirming the effectiveness of Transformers for medical image registration.


Turkish startups seek larger footprint in US market

#artificialintelligence

Seeking a firm foothold in the United States, Turkish startups are revving up their investments in the world's most prominent market. Following in the footsteps of gaming ventures that have reached the top of the most-downloaded games lists, Turkey's ultrafast grocery delivery company Getir recently launched operations in the U.S., only a few months after expanding into Europe. Meanwhile, Vispera, which offers machine vision and machine learning solutions for the fast-moving consumer goods and retail sectors across 25 countries, has decided to develop and strengthen its U.S. subsidiary, Vispera Corp. The company also plans to build on the strong business partnerships and product developments it established in 2021. Offering a range of image recognition tools that can solve common issues in retail spaces, such as tracking inventory levels and displaying current promotions more effectively, Vispera has announced it will complete a new investment round in the first months of 2022. The Istanbul-based company seeks to boost its activities in the U.S. through its subsidiary Vispera Corp.


Visual Semantics Allow for Textual Reasoning Better in Scene Text Recognition

arXiv.org Artificial Intelligence

Existing Scene Text Recognition (STR) methods typically use a language model to optimize the joint probability of the 1D character sequence predicted by a visual recognition (VR) model. This ignores the 2D spatial context of visual semantics within and between character instances, so these methods do not generalize well to arbitrarily shaped scene text. To address this issue, we make the first attempt to perform textual reasoning based on visual semantics in this paper. Technically, given the character segmentation maps predicted by a VR model, we construct a subgraph for each instance, where nodes represent the pixels in it and edges are added between nodes based on their spatial similarity. Then, these subgraphs are sequentially connected by their root nodes and merged into a complete graph. Based on this graph, we devise a graph convolutional network for textual reasoning (GTR) by supervising it with a cross-entropy loss. GTR can be easily plugged into representative STR models to improve their performance owing to better textual reasoning. Specifically, we construct our model, namely S-GTR, by paralleling GTR to the language model in a segmentation-based STR baseline, which can effectively exploit the visual-linguistic complementarity via mutual learning. S-GTR sets a new state of the art on six challenging STR benchmarks and generalizes well to multi-linguistic datasets. Code is available at https://github.com/adeline-cs/GTR.
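The graph construction described above can be illustrated with a small sketch. It is not the authors' implementation (see their repository for that) and rests on several assumptions: each character instance is given as a binary mask, "spatial similarity" is approximated by 4-neighbor adjacency, and the root node of an instance is simply its first pixel in raster order. The function names are hypothetical.

```python
def build_instance_subgraph(mask):
    """Build nodes (pixels) and edges (adjacent pixel pairs) for one character mask."""
    nodes = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    node_set = set(nodes)
    edges = set()
    for r, c in nodes:
        # Assumption: right and down neighbors stand in for "spatial similarity".
        for dr, dc in ((0, 1), (1, 0)):
            neighbor = (r + dr, c + dc)
            if neighbor in node_set:
                edges.add(((r, c), neighbor))
    return nodes, edges


def merge_subgraphs(masks):
    """Chain per-instance subgraphs through their root nodes into one graph."""
    all_nodes, all_edges, roots = [], set(), []
    for idx, mask in enumerate(masks):
        nodes, edges = build_instance_subgraph(mask)
        if not nodes:
            continue  # skip empty instances
        # Tag every pixel with its instance index so nodes stay unique.
        tagged = [(idx, r, c) for r, c in nodes]
        all_nodes.extend(tagged)
        all_edges.update(((idx, *a), (idx, *b)) for a, b in edges)
        roots.append(tagged[0])  # assumption: first raster-order pixel is the root
    # Sequentially connect consecutive instances by their root nodes.
    for a, b in zip(roots, roots[1:]):
        all_edges.add((a, b))
    return all_nodes, all_edges
```

A graph convolutional network would then operate on the merged node and edge sets; that part is outside the scope of this sketch.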


Object Recognition vs. Image Recognition

#artificialintelligence

Object recognition is a subfield of computer vision, artificial intelligence, and machine learning that seeks to recognize and identify the most prominent objects (i.e., people or things) in a digital image or video with AI models. Image recognition is also a subfield of AI and computer vision, one that seeks to recognize the high-level contents of an image. If you're familiar with the domain of computer vision, you might think that object recognition sounds very similar to image recognition. However, there's a subtle yet important difference between the two, and the best way to illustrate it is through an example. Given a photograph of a soccer game, an image recognition model would return a single label such as "soccer game."
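One way to picture the distinction is by the shape of each task's output. The sketch below uses hard-coded stand-ins instead of real models. The `Detection` type, the bounding-box layout, and the example objects are illustrative assumptions, not something the article specifies.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Detection:
    label: str
    # Assumed layout: (x, y, width, height) in pixels.
    box: Tuple[int, int, int, int]


def image_recognition_output() -> str:
    """Stand-in for an image recognition model: one label for the whole image."""
    return "soccer game"


def object_recognition_output() -> List[Detection]:
    """Stand-in for an object recognition model: one entry per prominent object."""
    return [
        Detection("person", (40, 60, 30, 80)),
        Detection("person", (120, 55, 28, 82)),
        Detection("ball", (95, 130, 10, 10)),
    ]
```

Under these assumptions, the first function yields a single string describing the scene, while the second yields a list whose length depends on how many objects are found.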