Image Matching
Facebook uses AI to help blind people 'see' the site: Feature uses audio and image recognition to describe scenes in photos
Many of us may bemoan the constant photos of babies, food or sunsets on Facebook, but blind users don't have that luxury. Until now, visually impaired people on the site could only use audio descriptions to explain they were 'looking' at a photo, without any other details. Now, Facebook has started using artificial intelligence and image recognition to reveal what is shown in these photos.
Shutterstock's reverse image search promises a gentler side of AI
For designers and photographers, selecting and laying out photos is often subjective, requiring a keen sense of color and composition. Using a computer algorithm, the stock footage site Shutterstock hopes to make that process easier. It now offers a reverse image search tool that analyzes the pixels in a photo and returns images that are similar in "look and feel" to the original without requiring a user to type in keywords to search. Dragging a photo of a stained-glass cathedral window into the search box, the company demonstrates in a video, produces a series of related images that more closely match the original in color and composition. The new search engine works by using a customized convolutional neural network, a type of machine learning tool that is modeled on how the brain's visual cortex, especially that of animals, processes images.
Artificial Intelligence Startup Funded for Patented Image Recognition Breakthrough by State of
The platform will be an integral part of the Image Search Engine for the Image Referral Network and Image Ad Network, automating the generation and placement of highly relevant targeted ads based on images at large scale for the first time in the industry. ZAC's AI Discovery platform can also be used for other types of images, data, or objects, e.g., clothing, purses, accessories, medical images, satellite images, and biometrics. ZAC has an impressive team of scientists and developers. The software development is headed by Saied Tadayon, a veteran software developer and scientist, who earned a PhD from Cornell at age 23. One of ZAC's inventors is Prof. Lotfi A. Zadeh ("The Father of Fuzzy Logic"), a pioneering computer scientist at U.C. Berkeley.
Google Cloud Machine Learning is sailing into mainstream
Google made an announcement that means strictly business for its push to be known as a leader in cloud services. "Today [Wednesday], we've taken a major stride forward with the announcement of a new product family: Cloud Machine Learning." The move is all about taking Cloud Machine Learning mainstream, "giving data scientists and developers a way to build a new class of intelligent applications," according to a post from Fausto Ibarra, director, product management, Google Cloud Platform. Blair Hanley Frank of IDG News Service said that, in so doing, "Google is making it easier for businesses to take advantage of the machine learning revolution with a new product for building models that predict the future." Cade Metz in Wired said "the company unveiled a new family of cloud computing services that allow any developer or business to use the machine learning technologies that power some of Google's most powerful services." Basically, as Robert Hof in SiliconANGLE put it, "Now that Google has infused the brand of artificial intelligence known as machine learning into everything from search to speech recognition, it's tossing the technology it views as tech's next big wave into the public domain." The announcement was part of a two-day conference in San Francisco, the Google Cloud Platform (GCP) Next 2016. The company said Cloud Machine Learning provides machine learning services, with pre-trained models and a platform on which users can build their own tailored models. Major Google applications use Cloud Machine Learning, including Photos (image search), the Google app (voice search), Translate, and Inbox (Smart Reply), but now the platform is available as a cloud service for business applications.
Google is not shy about blowing its own horn when it comes to machine learning technology: Compared to other large scale deep learning systems, said Google, "Our neural net-based ML platform has better training performance and increased accuracy." This announcement may be filed under strategy. Robert Hof made the observation: "Google made it clear that it intends to pitch the Cloud Machine Learning services announcement as a key differentiator."
Google Working On Gesture-Based Keyboard For iPhone With GIF, Image And Search Functions: Report
Google may be looking to ensure its control of the world's search market by building a gesture-based keyboard with integrated search functionality for Apple's iPhone and iPad, according to sources speaking to the Verge. According to the report, Google employees have been testing the new keyboard for several months already, though the search giant has yet to decide when or if it will release the keyboard to Apple's App Store. With the introduction of iOS 8 in 2014, Apple finally opened up its software to allow users to install third-party keyboards to replace the stock iOS version. Google's keyboard, like the stock Android keyboard, allows for gesture typing, a feature that lets users swipe a finger from one letter to the next while Google guesses the intended word from the shape of the gesture. In addition to this gesture feature, Google is said to be integrating GIFs and images directly into the keyboard, which would be powered by Google's own image search.
Manitest: Are classifiers really invariant?
Fawzi, Alhussein, Frossard, Pascal
Invariance to geometric transformations is a highly desirable property of automatic classifiers in many image recognition tasks. Nevertheless, it is unclear to what extent state-of-the-art classifiers are invariant to basic transformations such as rotations and translations. This is mainly due to the lack of general methods that properly measure such an invariance. In this paper, we propose a rigorous and systematic approach for quantifying the invariance to geometric transformations of any classifier. Our key idea is to cast the problem of assessing a classifier's invariance as the computation of geodesics along the manifold of transformed images. We propose the Manitest method, built on the efficient Fast Marching algorithm, to compute the invariance of classifiers. Our new method quantifies in particular the importance of data augmentation for learning invariance from data, and the increased invariance of convolutional neural networks with depth. We foresee that the proposed generic tool for measuring invariance to a large class of geometric transformations and arbitrary classifiers will have many applications for evaluating and comparing classifiers based on their invariance, and will help improve the invariance of existing classifiers.
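To make the idea of "measuring invariance" concrete, here is a minimal sketch that reports the smallest rotation angle that changes a classifier's label. It is not the Manitest algorithm: instead of Fast Marching over a manifold of transformed images, it does a brute-force scan over a single transformation parameter (rotation angle). The toy classifiers, the point-set "image" representation, and all function names are hypothetical and chosen only for illustration.

```python
import math


def rotate(points, theta):
    """Rotate a list of 2D points about the origin by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y, s * x + c * y) for x, y in points]


def centroid_classifier(points):
    """Hypothetical classifier: label 1 if the centroid lies right of the y-axis."""
    mean_x = sum(x for x, _ in points) / len(points)
    return 1 if mean_x > 0 else 0


def radius_classifier(points):
    """Hypothetical rotation-invariant classifier: label 1 if the mean radius exceeds 1."""
    mean_r = sum(math.hypot(x, y) for x, y in points) / len(points)
    return 1 if mean_r > 1.0 else 0


def min_label_changing_rotation(classifier, points, step_deg=1.0, max_deg=180.0):
    """Return the smallest |angle| in degrees (on a step_deg grid) whose rotation
    changes the predicted label, or None if no rotation up to max_deg does.

    A larger value means the classifier is more invariant to rotation on this input.
    """
    base_label = classifier(points)
    n_steps = int(max_deg / step_deg)
    for k in range(1, n_steps + 1):
        angle = k * step_deg
        # Check both rotation directions at the same magnitude.
        for signed in (angle, -angle):
            if classifier(rotate(points, math.radians(signed))) != base_label:
                return angle
    return None
```

Averaging this score over a dataset would give a rough per-classifier invariance number; the grid resolution `step_deg` bounds its precision.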
Prajna: Towards Recognizing Whatever You Want from Images without Image Labeling
Hua, Xian-Sheng (Microsoft Research) | Li, Jin (Microsoft Research)
With the advances in distributed computation, machine learning and deep neural networks, we enter an era in which it is possible to build a real-world image recognition system. There are three essential components to building a real-world image recognition system: 1) creating representative features, 2) designing powerful learning approaches, and 3) identifying massive training data. While extensive research has been done on the first two aspects, much less attention has been paid to the third. In this paper, we present an end-to-end Web knowledge discovery system, Prajna. Starting from an arbitrary set of entities as inputs, Prajna automatically crawls images from multiple sources, identifies images that have been reliably labeled, trains models and builds a recognition system that is capable of recognizing any new images of the entity set. Due to the high cost of manual data labeling, leveraging the massive yet noisy data on the Internet is a natural idea, but the practical engineering aspect is highly challenging. Prajna focuses on separating reliable training data from extensive noisy data, which is key to extending an image recognition system to support arbitrary entities. In this paper, we analyze the intrinsic characteristics of Internet image data, and find ways to mine accurate and informative information from those data to build a training set, which is then used to train image recognition models. Prajna is capable of automatically building an image recognition system for those entities as long as we can collect a sufficient number of images of the entities on the Web.
Sub-Selective Quantization for Large-Scale Image Search
Li, Yeqing (University of Texas at Arlington) | Chen, Chen (University of Texas at Arlington) | Liu, Wei (IBM T. J. Watson Research Center) | Huang, Junzhou (University of Texas at Arlington)
Recently, with the explosive growth of visual content on the Internet, large-scale image search has attracted intensive attention. It has been shown that mapping high-dimensional image descriptors to compact binary codes can lead to considerable efficiency gains in both storage and similarity computation of images. However, most existing methods still suffer from expensive training devoted to large-scale binary code learning. To address this issue, we propose a sub-selection based matrix manipulation algorithm which can significantly reduce the computational cost of code learning. As case studies, we apply the sub-selection algorithm to two popular quantization techniques: PCA Quantization (PCAQ) and Iterative Quantization (ITQ). Crucially, we can justify the resulting sub-selective quantization by proving its theoretic properties. Extensive experiments are carried out on three image benchmarks with up to one million samples, corroborating the efficacy of the sub-selective quantization method in terms of image retrieval.
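The efficiency argument for binary codes can be illustrated with a small sketch. It does not implement PCAQ, ITQ, or the sub-selection algorithm; as a stand-in, it uses random hyperplanes (an assumption made here for brevity) to turn a real-valued descriptor into a bit string stored in one integer, then ranks database items by Hamming distance. All function names are hypothetical.

```python
import random


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random Gaussian directions in a dim-dimensional space."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def encode(vec, hyperplanes):
    """Map a real-valued descriptor to an integer whose bits are the signs
    of its projections onto each hyperplane normal."""
    code = 0
    for plane in hyperplanes:
        dot = sum(p * v for p, v in zip(plane, vec))
        code = (code << 1) | (1 if dot >= 0 else 0)
    return code


def hamming(a, b):
    """Number of differing bits between two integer codes."""
    return bin(a ^ b).count("1")


def search(query_code, db_codes, k=3):
    """Return indices of the k database codes closest to the query in Hamming distance."""
    order = sorted(range(len(db_codes)), key=lambda i: hamming(query_code, db_codes[i]))
    return order[:k]
```

Each descriptor shrinks to `n_bits` bits, and comparing two items costs one XOR plus a bit count, which is where the storage and similarity-computation savings come from. A learned projection such as PCAQ or ITQ would replace `make_hyperplanes` while leaving `encode`, `hamming`, and `search` unchanged.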
Distributed High Dimensional Information Theoretical Image Registration via Random Projections
Szabo, Zoltan, Lorincz, Andras
However, the estimation of these quantities is computationally intensive in high dimensions. On the other hand, consistent estimation from pairwise distances of the sample points is possible, which suits random projection (RP) based low dimensional embeddings. We adapt the RP technique to this task by means of a simple ensemble method. To the best of our knowledge, this is the first distributed, RP based information theoretical image registration approach. The efficiency of the method is demonstrated through numerical examples.
Keywords: random projection, information theoretical image registration, high dimensional features, distributed solution
1. Introduction
Machine learning methods are notoriously limited by the high dimensional nature of the data. This problem may be alleviated via the random projection (RP) technique, which has been successfully applied, e.g., in the fields of
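As a rough illustration of why random projections suit estimators built on pairwise distances, the sketch below projects two high dimensional points with a Gaussian matrix scaled by 1/sqrt(k) and averages the squared distance over several independent projections. This is only a generic RP demonstration under those assumptions; it is not the authors' registration method or their ensemble construction, and the function names and parameters are hypothetical.

```python
import math
import random


def random_projection_matrix(d, k, rng):
    """k x d matrix with i.i.d. N(0, 1/k) entries, so squared lengths are
    preserved in expectation."""
    scale = 1.0 / math.sqrt(k)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(d)] for _ in range(k)]


def project(matrix, vec):
    """Apply the projection matrix to a vector."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


def sq_dist(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def ensemble_sq_dist(x, y, k=25, n_projections=20, seed=0):
    """Average the projected squared distance over n_projections independent
    k-dimensional random projections."""
    rng = random.Random(seed)
    d = len(x)
    estimates = []
    for _ in range(n_projections):
        r = random_projection_matrix(d, k, rng)
        estimates.append(sq_dist(project(r, x), project(r, y)))
    return sum(estimates) / len(estimates)
```

Because each projection is independent, the `n_projections` estimates could be computed on separate workers and averaged afterwards, which is the sense in which such a scheme distributes naturally.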
Examples of Artificial Perceptions in Optical Character Recognition and Iris Recognition
Noaica, Cristina M., Badea, Robert, Motoc, Iulia M., Ghica, Claudiu G., Rosoiu, Alin C., Popescu-Bodorin, Nicolaie
This paper assumes the hypothesis that human learning is perception based, and consequently, the learning process and perceptions should not be represented and investigated independently or modeled in different simulation spaces. In order to keep the analogy between the artificial and human learning, the former is assumed here as being based on the artificial perception. Hence, instead of choosing to apply or develop a Computational Theory of (human) Perceptions, we choose to mirror the human perceptions in a numeric (computational) space as artificial perceptions and to analyze the interdependence between artificial learning and artificial perception in the same numeric space, using one of the simplest tools of Artificial Intelligence and Soft Computing, namely the perceptrons. As practical applications, we choose to work around two examples: Optical Character Recognition and Iris Recognition. In both cases a simple Turing test shows that artificial perceptions of the difference between two characters and between two irides are fuzzy, whereas the corresponding human perceptions are, in fact, crisp.