 Pattern Recognition


Artificial Intelligence in Gestural Interfaces – Possible Near-Term Applications Emerj

#artificialintelligence

Gesture-based interfaces are applications that allow users to control devices using their hands and other body parts. Today, they are found in devices used in home automation, shopping, consumer electronics, virtual reality and augmented reality gaming, navigation, and driving, among others. One study projects that the global market for gesture recognition in retail will grow by 27.54 percent from 2018 to 2023. To date, some of the top producers of gestural interface products include Intel, Apple, Microsoft, and Google. According to research titled Hand Gesture Recognition Using Computer Vision, gesture recognition is done in two ways: with data glove sensor devices that transform hand and finger motions into digital data, and with computer vision, which uses a camera. The second method may let humans interact more naturally with machines because it leaves their hands free to move.


Recurrent Registration Neural Networks for Deformable Image Registration

arXiv.org Machine Learning

Parametric spatial transformation models have been successfully applied to image registration tasks. In such models, the transformation of interest is parameterized by a fixed set of basis functions, such as B-splines. Each basis function is located at a fixed regular grid position across the image domain, because the transformation of interest is not known in advance. As a consequence, not all basis functions will necessarily contribute to the final transformation, which results in a non-compact representation of the transformation. We reformulate the pairwise registration problem as a recursive sequence of successive alignments. For each element in the sequence, a local deformation defined by its position, shape, and weight is computed by our recurrent registration neural network. The sum of all local deformations yields the final spatial alignment of both images. Formulating the registration problem in this way allows the network to detect non-aligned regions in the images and to learn how to locally refine the registration properly. In contrast to current non-sequence-based registration methods, our approach iteratively applies local spatial deformations to the images until the desired registration accuracy is achieved. We trained our network on 2D magnetic resonance images of the lung and compared our method to a standard parametric B-spline registration. The experiments show that our method performs on par in accuracy but yields a more compact representation of the transformation. Furthermore, we achieve a speedup of around a factor of 15 compared to the B-spline registration.
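The idea of building the final transformation as a sum of local deformations, each given by a position, a shape, and a weight, can be illustrated with a small sketch. This is not the paper's network: the Gaussian kernel, the function names, and the hand-picked parameters below are all assumptions made for illustration, and in the paper the parameters of each step are predicted by the recurrent network.

```python
import math


def local_deformation(height, width, center, sigma, weight):
    """Displacement field of one local deformation on a height x width grid.

    Assumption: the local shape is an isotropic Gaussian of width `sigma`
    centered at `center` (row, col); `weight` is the (dy, dx) displacement
    applied at the center.
    """
    cy, cx = center
    wy, wx = weight
    field = []
    for y in range(height):
        row = []
        for x in range(width):
            g = math.exp(-((y - cy) ** 2 + (x - cx) ** 2) / (2.0 * sigma ** 2))
            row.append((wy * g, wx * g))
        field.append(row)
    return field


def sum_deformations(height, width, steps):
    """Accumulate a sequence of local deformations into one displacement field."""
    total = [[(0.0, 0.0) for _ in range(width)] for _ in range(height)]
    for center, sigma, weight in steps:
        local = local_deformation(height, width, center, sigma, weight)
        for y in range(height):
            for x in range(width):
                ty, tx = total[y][x]
                ly, lx = local[y][x]
                total[y][x] = (ty + ly, tx + lx)
    return total


# Two hypothetical steps: (position, shape, weight).
steps = [((2, 2), 1.0, (1.0, 0.0)), ((5, 5), 2.0, (0.0, -0.5))]
field = sum_deformations(8, 8, steps)
```

Only two parameter triples are stored here, whereas a regular B-spline grid would keep a coefficient at every grid position whether or not it contributes. That is the sense in which a sequence of local deformations can be a more compact representation.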


Sparse Representation Classification via Screening for Graphs

arXiv.org Machine Learning

The sparse representation classifier (SRC) has been shown to work well for image recognition problems that satisfy a subspace assumption. In this paper we propose a new implementation of SRC via screening, establish its equivalence to the original SRC under regularity conditions, and prove its classification consistency for random graphs drawn from stochastic blockmodels. The results are demonstrated via simulations and real data experiments, where the new algorithm achieves comparable numerical performance but runs significantly faster.


Using Google Vision AI's Reverse Image Search To Richly Catalog Television News

#artificialintelligence

Deep learning has revolutionized the machine understanding of imagery. Yet today's image recognition models are still limited by the availability of large annotated training datasets upon which to build their libraries of recognized objects and activities. To address this, Google's Vision AI API expands its native catalog of around 10,000 visually recognized objects and activities with the ability to perform the equivalent of a reverse Google Images search across the open Web and tally up the top topics used to caption the given image everywhere it has previously appeared, lending unprecedentedly rich context and understanding, even yielding unique labels for breaking news events. What might this process yield for a week of television news? Google's Vision AI API represents a unique hybrid between traditional deep learning-based image labeling, which draws on a library of previously trained models, and the ability to leverage the open Web to annotate images with the topics most commonly used to caption visually similar images. Using its Web Entities feature, the Vision AI API performs what amounts to a reverse Google Images search over the open Web, identifying images across the entire Web that look most similar to the given image.
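As a rough sketch of how such a tally could be assembled, the snippet below builds a request body for the Vision API's `images:annotate` REST method with the `WEB_DETECTION` feature and counts the Web Entity descriptions across a batch of per-image responses. The helper names, the image URI, and the sample responses are hypothetical; no network call or authentication is shown, and this is not the article's actual pipeline.

```python
from collections import Counter


def build_web_detection_request(image_uri, max_results=10):
    """Request body for POST https://vision.googleapis.com/v1/images:annotate."""
    return {
        "requests": [
            {
                "image": {"source": {"imageUri": image_uri}},
                "features": [{"type": "WEB_DETECTION", "maxResults": max_results}],
            }
        ]
    }


def tally_web_entities(responses):
    """Count how often each Web Entity description appears.

    `responses` is a list of per-image annotation results, i.e. the items
    found under the "responses" key of the API reply.
    """
    counts = Counter()
    for response in responses:
        web = response.get("webDetection", {})
        for entity in web.get("webEntities", []):
            description = entity.get("description")
            if description:  # some entities carry no description
                counts[description] += 1
    return counts


# Hypothetical replies for three sampled video frames (made-up data).
sample_responses = [
    {"webDetection": {"webEntities": [{"description": "News", "score": 0.9},
                                      {"description": "Weather", "score": 0.6}]}},
    {"webDetection": {"webEntities": [{"description": "News", "score": 0.8}]}},
    {"webDetection": {"webEntities": [{"entityId": "/m/xyz", "score": 0.4}]}},
]
body = build_web_detection_request("gs://example-bucket/frame-0001.jpg")
top_topics = tally_web_entities(sample_responses).most_common()
```

Applied to frames sampled from a week of broadcasts, the same tally would give a ranked list of the topics that the open Web associates with what appeared on screen.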


Statistically Significant Discriminative Patterns Searching

arXiv.org Machine Learning

Discriminative pattern mining is an essential task in data mining. This task aims to discover patterns that occur more frequently in one class than in the other classes of a class-labeled dataset. This type of pattern is valuable in various domains such as bioinformatics and data classification. In this paper, we propose a novel algorithm, named SSDPS, to discover patterns in two-class datasets. The SSDPS algorithm owes its efficiency to an original enumeration strategy for the patterns, which allows it to exploit some degree of anti-monotonicity in the measures of discriminance and statistical significance. Experimental results demonstrate that the SSDPS algorithm performs better than other algorithms. In addition, the number of generated patterns is much smaller than that of the other algorithms. Experiments on real data also show that SSDPS efficiently detects multiple-SNP combinations in genetic data.


Building a Chat Bot With Image Recognition and OCR

#artificialintelligence

In part 1 of this series, we gave our bot the ability to detect sentiment from text and respond accordingly. But that's about all it can do, which is admittedly quite boring. Of course, in a real chat, we often send a multitude of media: text, images, videos, GIFs, and anything else. So in this next step of our journey, let's give our bot vision. The goal of this tutorial is to allow our bot to receive images, reply to them, and eventually give us a crude description of the main object in said image.


EFFORTLESS IMAGE SCANNING

#artificialintelligence

Satisfying consumers is the most important part of any business advancement, and that's what we helped our beverage partner achieve using machine learning. Image recognition is being widely adopted in the FMCG sector to eliminate poor management and supply- and storage-related issues with products. We are living in a mobile era, relying heavily on apps for everything from ordering groceries to professional work to booking cabs and tickets. Still, to participate in contests, consumers have to manually send an SMS to make their entries. What if there were a mobile or web app that is fast, small, and accurate, and that could remove the manual work of sending an SMS just by scanning an image?


r/deeplearning - Is there any framework recommended to start with text to image recognition?

#artificialintelligence

Hey! I'm new to this community, and I'm working on a project that might be a good scenario for implementing text recognition in order to output images for them. Please, if anyone knows where I could start or which tools I could try, it would be great!


Kernel Mean Embedding Based Hypothesis Tests for Comparing Spatial Point Patterns

arXiv.org Machine Learning

This paper introduces an approach for detecting differences in the first-order structures of spatial point patterns. The proposed approach leverages the kernel mean embedding in a novel way by introducing an approximate version of it tailored to spatial point processes. While the original embedding is infinite-dimensional and implicit, our approximate embedding is finite-dimensional and comes with explicit closed-form formulas. With its help we reduce the pattern comparison problem to the comparison of means in Euclidean space. Hypothesis testing is based on conducting $t$-tests on each dimension of the embedding and combining the resulting $p$-values using the harmonic mean $p$-value combination technique. The main advantages of the proposed approach are that it can be applied to both single and replicated pattern comparisons, and that neither bootstrap nor permutation procedures are needed to obtain or calibrate the $p$-values. Our experiments show that the resulting tests are powerful and the $p$-values are well-calibrated; two applications to real-world data are presented.
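The combination step can be sketched in a few lines. The function below computes the raw weighted harmonic mean of a set of $p$-values, which is the core of the harmonic mean $p$-value technique. The function name and the example values are hypothetical, the per-dimension $t$-tests that would produce the inputs are not shown, and any further calibration the paper applies to the combined value is not reproduced here.

```python
def harmonic_mean_p_value(p_values, weights=None):
    """Raw weighted harmonic mean of p-values.

    Computes sum(w_i) / sum(w_i / p_i). Equal weights are assumed when
    none are given. Every p-value must lie in (0, 1].
    """
    if not p_values:
        raise ValueError("at least one p-value is required")
    if weights is None:
        weights = [1.0 / len(p_values)] * len(p_values)
    if len(weights) != len(p_values):
        raise ValueError("weights and p_values must have the same length")
    if any(p <= 0.0 or p > 1.0 for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    return sum(weights) / sum(w / p for w, p in zip(weights, p_values))


# Hypothetical p-values, one per embedding dimension (made-up numbers).
per_dimension_p = [0.01, 0.5, 0.5]
combined = harmonic_mean_p_value(per_dimension_p)
```

Because the harmonic mean is dominated by its smallest inputs, a single strongly significant dimension pulls the combined value down even when the remaining dimensions show no difference.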