If you are looking for an answer to the question "What is Artificial Intelligence?" and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."
However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …
Telecoms giant BT will begin work with a new visual artificial intelligence company to create products across the retail and public sectors. London-based Cortexica Vision Systems, which won at this year's BT Infinity Awards competition, will bring its AI video analysis technology and expertise to BT. "Cortexica's core technology, strong academic pedigree and track record of delivering applications for the real world meant our decision was a simple one," said Colm O'Neill, MD Major Business and Public Sector for BT. "The BT Infinity Awards aim not only to showcase transformational tech for tomorrow's world, but to connect BT's customers with it and stimulate adoption in a range of sectors. Past winners of the award have gone on to become trusted partners for many of our customers, and we hope Cortexica will do the same."
The recent release of MXNet version 1.2.0 introduces the new MXNet Scala Inference API. This release focuses on optimizing the developer experience for inference applications written in Scala. Scala is a general-purpose programming language that supports both functional programming and a strong static type system, and it is used for large-scale distributed processing on platforms such as Apache Spark. Now that you have had the grand tour of the new Scala API, you're ready to try it out yourself. You will first need to set up your dev environment with the mxnet-full package, and then you can try your hand at an image classification example and an object detection example (which we will demonstrate in the next post).
About 10 years ago, it would have been hard to believe that you could ask a Bluetooth speaker for a classic cheese soufflé recipe, or take a picture of an object with your phone and find out exactly where to purchase it. These interactions have been made possible primarily by advancements in machine learning. One of the biggest developments in AI over the past three years has been in voice recognition and natural language processing, and we're starting to see advancements in more complex human-machine interaction in the form of image/video search. Forward-thinking businesses are already using this new form of machine-learning image recognition to let users search with pictures for the same or similar looks and outfits they stock. However, does this mean intelligent image search is the next big thing?
Despite the advancement of supervised image recognition algorithms, their dependence on the availability of labeled data and the rapid expansion of image categories raise the significant challenge of zero-shot learning. Zero-shot learning (ZSL) aims to transfer knowledge from labeled classes to unlabeled classes to reduce human labeling effort. In this paper, we propose a novel self-training ensemble network model to address zero-shot image recognition. The ensemble network is built by learning multiple image classification functions with a shared feature extraction network but different label embedding representations, each of which facilitates information transfer to different subsets of unlabeled classes. A self-training framework is then deployed to iteratively label the most confident images in each unlabeled class with predicted pseudo-labels and update the ensemble network with the training data augmented by the pseudo-labels. The proposed model performs training on both labeled and unlabeled data. It can naturally bridge the domain shift problem in visual appearances and be extended to the generalized zero-shot learning scenario. We conduct experiments on multiple standard ZSL datasets, and the empirical results demonstrate the efficacy of the proposed model.
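The self-training loop the abstract describes can be illustrated in miniature. The Python sketch below is a toy stand-in, not the paper's method: a 1-D nearest-prototype classifier replaces the ensemble network, and all names and data are hypothetical. It shows only the core iteration — pseudo-label the most confident unlabeled point, then retrain on the augmented data.

```python
# Toy sketch of a self-training loop: pseudo-label the single most
# confident unlabeled point each round, then update class prototypes
# ("retrain") using the augmented labeled set. All names hypothetical.

def score(x, prototypes):
    """Confidence of point x for each class: negative distance to prototype."""
    return {c: -abs(x - p) for c, p in prototypes.items()}

def self_train(labeled, unlabeled, unseen_protos, rounds=3):
    """labeled: dict class -> list of 1-D points (may start empty);
    unseen_protos: initial prototype guesses for the unseen classes."""
    protos = dict(unseen_protos)
    pool = list(unlabeled)
    for _ in range(rounds):
        if not pool:
            break
        # pick the most confident (point, class) pair across the pool
        best = max(((x, c, s) for x in pool
                    for c, s in score(x, protos).items()),
                   key=lambda t: t[2])
        x, c, _ = best
        pool.remove(x)
        labeled.setdefault(c, []).append(x)             # pseudo-label
        protos[c] = sum(labeled[c]) / len(labeled[c])   # "retrain" prototype
    return protos

self_train({}, [0.9, 1.1, 4.8], {"a": 1.0, "b": 5.0})
```

In the real model the "retrain" step updates the shared feature extractor and the per-embedding classifiers, but the confidence-ranked pseudo-labeling structure is the same.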
During the MVP Summit this year and at Windows Developer Day, Microsoft spoke a lot about WinML. From that moment on, I was trying to find some spare time to start playing with it. I finally managed to build a very simple UWP Console app that does image classification, using an ONNX file that I trained in the cloud. In this blog post I'll show you exactly how I built it. The resulting UWP Console app will take all images from the executing folder, classify them, and add the classification as a Tag to the metadata of each image.
In Machine Learning and Robotics, the semantic content of visual features is usually provided to the system by a human who interprets it. By contrast, strictly unsupervised approaches have difficulty relating the statistics of sensory inputs to their semantic content without also relying on prior knowledge introduced into the system. In this paper, we propose to tackle this problem from a sensorimotor perspective. In line with the Sensorimotor Contingencies Theory, we make the fundamental assumption that the semantic content of sensory inputs at least partially stems from the way an agent can actively transform them. We illustrate our approach by formalizing how simple visual features can induce invariants in a naive agent's sensorimotor experience, and we evaluate it on a simple simulated visual system. Without any a priori knowledge of the way its sensorimotor information is encoded, we show how an agent can characterize the uniformity and edge-ness of the visual features it interacts with.
During the opening F8 2018 keynote, Facebook CEO Mark Zuckerberg showed off the company's latest Instagram updates: Spotify integration, AI-based anti-bullying comment filters, AR camera effects and four-way video chat. During the Day 2 keynote, Facebook revealed how your daily Instagram updates are giving its AI technology a deep-learning crash course in image recognition, one that's apparently made its AI even smarter than Google's at categorizing objects in photos. Facebook pulled this off, amazingly enough, by instructing its AI to read photo hashtags and interpret photos' subject matter. Using this strategy, called "weakly supervised training", Facebook's AI achieved a record 85.4% accuracy rating on an industry-wide test of image recognition, beating out Google's previous record. A Facebook Engineering blog post went into detail on the methods.
In the race to continue building more sophisticated AI deep learning models, Facebook has a secret weapon: billions of images on Instagram. In research the company is presenting today at F8, Facebook details how it took what amounted to billions of public Instagram photos that had been annotated by users with hashtags and used that data to train its own image recognition models. The company relied on hundreds of GPUs running around the clock to parse the data, but was ultimately left with deep learning models that beat industry benchmarks, the best of which achieved 85.4 percent accuracy on ImageNet. If you've ever put a few hashtags on an Instagram photo, you'll know that doing so isn't exactly a research-grade process. There is generally some sort of method to why users tag an image with a specific hashtag; the challenge for Facebook was sorting out what was relevant across billions of images.
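The labeling step behind this "weakly supervised" approach — treating noisy user hashtags as training signal — can be sketched simply. The Python fragment below is a hypothetical illustration, not Facebook's pipeline: it filters a photo's hashtags against a fixed label vocabulary and builds a multi-hot target vector, dropping out-of-vocabulary tags as one crude way to cope with tag noise. The vocabulary and hashtags are invented examples.

```python
# Hypothetical sketch: turn noisy user hashtags into multi-hot
# training targets over a fixed label vocabulary. Tags outside the
# vocabulary (e.g. #nofilter) are simply dropped.

LABEL_VOCAB = ["dog", "cat", "beach", "sunset"]

def hashtags_to_target(hashtags):
    """Map a photo's hashtags to a multi-hot vector over LABEL_VOCAB."""
    tags = {t.lstrip("#").lower() for t in hashtags}
    return [1 if label in tags else 0 for label in LABEL_VOCAB]

hashtags_to_target(["#Dog", "#beach", "#nofilter"])  # -> [1, 0, 1, 0]
```

At Instagram scale the real vocabulary covers thousands of hashtags and the filtering is far more involved, but the essential move — using imperfect user tags in place of hand-curated labels — is what "weakly supervised" refers to here.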