Artificial intelligence (AI) researchers are unable to explain exactly how deep learning algorithms arrive at their conclusions. Deep learning is complex by nature, but that is no excuse for abandoning the pursuit of clarity about black-box decision making. Assessing the quality of a machine learning algorithm requires some level of transparency and an understanding of how a decision was made, because this affects the generalizability of the algorithm and the reliability of its output. In March 2019, researchers from the Fraunhofer Heinrich Hertz Institute, Technische Universität Berlin, Singapore University of Technology and Design, Korea University, and the Max Planck Institut für Informatik published in Nature Communications a method for validating the behavior of nonlinear machine learning in order to better assess the quality of the learning system. The research team of Klaus-Robert Müller, Wojciech Samek, Grégoire Montavon, Alexander Binder, Stephan Wäldchen, and Sebastian Lapuschkin discovered that various AI systems use what psychologists would characterize as a "Clever Hans" type of decision making, one based on correlation.
We study scalable and uniform understanding of facts in images. Existing visual recognition systems are typically modeled differently for each fact type, such as objects, actions, and interactions. We propose a setting where all these facts can be modeled simultaneously, with the capacity to understand an unbounded number of facts in a structured way. The training data comes as structured facts in images, including (1) objects (e.g., <boy>), (2) attributes (e.g., <boy, tall>), (3) actions (e.g., <boy, playing>), and (4) interactions (e.g., <boy, riding, a horse>). Each fact has a semantic language view (e.g., <boy, playing>) and a visual view (an image with this fact). We show that learning visual facts in a structured way enables not only a uniform but also a generalizable visual understanding. We propose and investigate recent and strong approaches from the multiview learning literature and also introduce two learning representation models as potential baselines. We apply the investigated methods to several datasets that we augmented with structured facts and to a large-scale dataset of more than 202,000 facts and 814,000 images. Our experiments show the advantage of the proposed models, which relate facts by their structure, over the designed baselines on bidirectional fact retrieval.
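To make the four fact types concrete, here is a minimal sketch of how such structured facts might be represented in code. The class name `Fact`, its field names, and the `arity` and `language_view` helpers are illustrative assumptions, not the representation used by the authors.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Fact:
    """Hypothetical container for one structured fact (names are assumptions)."""
    subject: str                      # e.g. "boy"
    predicate: Optional[str] = None   # attribute or action, e.g. "tall", "playing"
    obj: Optional[str] = None         # second entity of an interaction, e.g. "a horse"

    def __post_init__(self):
        # An interaction needs a predicate linking the subject to the object.
        if self.obj is not None and self.predicate is None:
            raise ValueError("a fact with an object also needs a predicate")

    @property
    def arity(self) -> int:
        """Number of filled slots: 1 = object, 2 = attribute or action, 3 = interaction."""
        return 1 + (self.predicate is not None) + (self.obj is not None)

    def language_view(self) -> str:
        """Render the fact in the angle-bracket notation used in the text."""
        parts = [p for p in (self.subject, self.predicate, self.obj) if p is not None]
        return "<" + ", ".join(parts) + ">"


if __name__ == "__main__":
    for fact in (Fact("boy"), Fact("boy", "tall"), Fact("boy", "riding", "a horse")):
        print(fact.arity, fact.language_view())
```

Note that an attribute such as <boy, tall> and an action such as <boy, playing> have the same shape in this sketch, so telling them apart would need extra information that is not modeled here.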
There are several things holding back our use of deep learning methods, and chief among them is that they are complicated and hard. Now there are three platforms that offer Automated Deep Learning (ADL) so simple that almost anyone can do it. A small percentage of our data science community has chosen the path of learning these new techniques, but it's a major departure, both in problem type and in technique, from the predictive and prescriptive modeling that makes up 90% of what we get paid to do. Artificial intelligence, at least in the true sense of image, video, text, and speech recognition and processing, is on everyone's lips, but it's still hard to find a data scientist qualified to execute your project.
The same thing happens in vision, not just in humans but in animals' visual systems generally. Brains are made up of neurons which "fire" by emitting electrical signals to other neurons after being sufficiently "activated". These neurons are malleable in terms of how much a signal from another neuron adds to their activation level. Vaguely speaking, the weights connecting neurons to each other end up being trained to make the neural connections more useful, just as the parameters in a linear regression can be trained to improve the mapping from input to output. Our biological networks are arranged in a hierarchical manner, so that certain neurons end up detecting not extremely specific features of the world around us but more abstract ones, i.e., patterns or groupings of lower-level features. For example, the fusiform face area in the human visual system is specialized for facial recognition.
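The idea of a neuron that fires once its weighted input is high enough, and whose weights are adjusted by training, can be sketched with a single artificial neuron. The example below is only an illustration: it uses the classic perceptron update rule and a toy logical-AND task, both chosen here as assumptions, and the function names `fires` and `train` are hypothetical.

```python
def fires(weights, bias, inputs):
    """Return 1 if the weighted sum of the inputs pushes the neuron past its threshold."""
    activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if activation > 0 else 0


def train(samples, epochs=10, learning_rate=1):
    """Adjust weights with the perceptron rule (an assumed, illustrative choice)."""
    weights = [0] * len(samples[0][0])
    bias = 0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - fires(weights, bias, inputs)
            # Strengthen or weaken each connection in proportion to its input.
            weights = [w + learning_rate * error * x for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


# Toy task: the neuron should fire only when both inputs are active (logical AND).
AND_SAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

if __name__ == "__main__":
    weights, bias = train(AND_SAMPLES)
    for inputs, target in AND_SAMPLES:
        print(inputs, "->", fires(weights, bias, inputs), "(target", target, ")")
```

Real biological neurons are far more complicated than this; the sketch only mirrors the loose analogy in the text, where trained connection weights determine when a unit activates.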
This tutorial walks you through the steps required to create a convolutional neural network (CNN/ConvNet) using TensorFlow and put it into production by allowing remote access through an HTTP-based application built on a Flask RESTful API. In this tutorial, the CNN is built using the TensorFlow NN (tf.nn) module. The CNN model architecture is created, trained, and tested on the CIFAR10 dataset. To make the model remotely accessible, a Flask web application is written in Python to receive an uploaded image and return its classification label over HTTP. Anaconda3 is used along with TensorFlow on Windows with CPU support.
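As a rough idea of what the serving side could look like, the sketch below shows a Flask endpoint that accepts an uploaded image and returns a label as JSON. It is not the tutorial's code: the route `/predict`, the form field name `image`, and the function `predict_label` are assumptions, and `predict_label` is a placeholder that does not run a CNN at all.

```python
import io

from flask import Flask, jsonify, request

# The ten CIFAR-10 class names in the dataset's label order.
CIFAR10_LABELS = ["airplane", "automobile", "bird", "cat", "deer",
                  "dog", "frog", "horse", "ship", "truck"]

app = Flask(__name__)


def predict_label(image_bytes):
    """Hypothetical stand-in for the trained CNN.

    A real application would decode the image, resize it to 32x32 and run the
    TensorFlow model. Here a class is picked from the byte sum so that the
    endpoint can be exercised without any model.
    """
    return CIFAR10_LABELS[sum(image_bytes) % len(CIFAR10_LABELS)]


@app.route("/predict", methods=["POST"])
def predict():
    # The form field name "image" is an assumption of this sketch.
    uploaded = request.files.get("image")
    if uploaded is None:
        return jsonify(error="no file uploaded under form field 'image'"), 400
    return jsonify(label=predict_label(uploaded.read()))


if __name__ == "__main__":
    # Exercise the endpoint in-process with Flask's test client instead of
    # starting a server; call app.run() to serve real HTTP requests.
    client = app.test_client()
    response = client.post(
        "/predict",
        data={"image": (io.BytesIO(b"fake image bytes"), "example.png")},
        content_type="multipart/form-data",
    )
    print(response.status_code, response.get_json())
```

Keeping the prediction logic in its own function means the placeholder can later be swapped for a call into the trained model without touching the HTTP handling.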