Image Matching
Unsupervised Learning of Spoken Language with Visual Context
Harwath, David, Torralba, Antonio, Glass, James
Humans learn to speak before they can read or write, so why can't computers do the same? In this paper, we present a deep neural network model capable of rudimentary spoken language acquisition using untranscribed audio training data, whose only supervision comes in the form of contextually relevant visual images. We describe the collection of our data comprised of over 120,000 spoken audio captions for the Places image dataset and evaluate our model on an image search and annotation task. We also provide some visualizations which suggest that our model is learning to recognize meaningful words within the caption spectrograms.
Apple's AI Team Publishes First Research Paper Focused on Advanced Image Recognition
Earlier in December, Apple announced that it would begin allowing its artificial intelligence and machine learning researchers to publish and share their work in papers, slightly pulling back the curtain on the company's famously secretive creation processes. Now, just a few weeks later, the first of those papers has been published, focusing on Apple's work in the intelligent image recognition field. Titled "Learning from Simulated and Unsupervised Images through Adversarial Training," the paper describes a program that can intelligently decipher and understand digital images in a setting similar to the "Siri Intelligence" and facial recognition features introduced in Photos in iOS 10, but more advanced. In the research, Apple notes the downsides and upsides of using real images compared with "synthetic," or computer-generated, images. Annotations must be added to real images, an "expensive and time-consuming task" that requires a human workforce to individually label objects in a picture.
Machine learning will make sure no one steals your logo
A computer's ability to accurately identify images is a white whale for many technology companies, from Baidu to Google. One Australian startup has found a corner of the market to dominate, winning contracts with the European Union Intellectual Property Office (EUIPO) and IP Australia for algorithms that can detect and compare logos. TrademarkVision, which has support from Australia's CEA Startup Fund, uses machine learning to support image searches that can identify similar trademarks. Having a unique trademark or logo is vital, but many intellectual property registration bodies often require outdated forms of non-visual search that make comparison difficult. Australia, for example, relies on keywords, Europe on Vienna codes and the U.S. on design codes.
Shutterstock's Data Scientist Kevin Lester Talks Reverse Image Search
Stock photo company Shutterstock introduced reverse image search for desktop earlier this spring. This made it easy for users to search Shutterstock's website with an image, instead of using keywords. Shutterstock's data scientist Kevin Lester, who looks closely at the adoption of these new tools, was able to find out what patterns emerge from the data. In fact, Lester told IBTimes that users who searched with reverse image search made more downloads per search than users who ran text-based searches. "We've found that users who performed at least one reverse image search prior to making a purchase with Shutterstock were 3.49 times more likely to make a subsequent purchase than those who did not," says Lester.
TrademarkVision uses machine learning to make finding logos as easy as a reverse image search
A company's logo is an important part of its identity, but the process behind defining, registering, and protecting these trademarks is a convoluted and rather archaic one. A startup called TrademarkVision aims to simplify it by replacing that laborious and arcane process with what amounts to a machine-learning-powered reverse image search. This isn't in some lab, either: the EU just switched their whole image trademark system over to it. Most people probably haven't had to do many trademark and logo searches. Well, why don't you take the USPTO's version for a spin so you know what it's like? Try to find the Nike "Swoosh" or something.
Amazon launches new artificial intelligence services for developers: Image recognition, text-to-speech, Alexa NLP
Amazon today announced three new artificial intelligence-related toolkits for developers building apps on Amazon Web Services. At the company's AWS re:Invent conference in Las Vegas, Amazon showed how developers can use three new services -- Amazon Lex, Amazon Polly, Amazon Rekognition -- to build artificial intelligence features into apps for platforms like Slack, Facebook Messenger, Zendesk, and others. The idea is to let developers utilize the machine learning algorithms and technology that Amazon has already created for its own processes and services like Alexa. Instead of developing their own AI software, AWS customers can simply use an API call or the AWS Management Console to incorporate AI features into their own apps. AWS CEO Andy Jassy noted that Amazon has been building AI and machine learning technology for 20 years and said that there are now thousands of people "dedicated to AI in our business."
Amazon Rekognition Is An Image Recognition Service By Amazon
One of the basic features of artificial intelligence (AI) is the ability to recognize images and process them. Companies like Microsoft and Google have debuted tools to show how accurate their image recognition platforms are. Now it seems that Amazon wants in as well, as it has announced Amazon Rekognition. This is an image recognition service that is part of a suite of deep-learning services that Amazon has recently announced for developers. For the most part it does what most image recognition services do, which is to identify human faces, identify emotions, and label objects just by looking at them.
This Google-powered AI can identify your terrible doodles
As part of Google's slew of artificial intelligence announcements today, the company is releasing a number of AI web experiments powered by its cloud services that anyone can go and play with. One -- called Quick, Draw! -- gives you a prompt to draw an image of a written word or phrase in under 20 seconds with your mouse cursor in such a way that a neural network can identify it. Quick, Draw! is a great way to familiarize yourself with how neural networks work to identify objects and text in photos, which is one of the most common forms of AI-guided software techniques we see daily on platforms like Facebook and Google Photos. As you start to craft the doodle, Quick, Draw!'s software automaton will start yelling out words and phrases it thinks you're trying to illustrate. As you get closer to the finished product, the voice starts to become a good indication of how your drawing could be misinterpreted as something else. If you're on point, however, the neural network will home in on the object and guess correctly.