'Explainable Artificial Intelligence': Cracking open the black box of AI

#artificialintelligence

At a demonstration of Amazon Web Services' new artificial intelligence image recognition tool last week, the deep learning analysis calculated with near certainty that a photo of speaker Glenn Gore depicted a potted plant. "It is very clever, it can do some amazing things but it needs a lot of hand holding still. AI is almost like a toddler. They can do some pretty cool things, sometimes they can cause a fair bit of trouble," said Gore, AWS's chief architect, in his day-two keynote at the company's summit in Sydney. Where the toddler analogy falls short, however, is that a parent can make a reasonable guess as to what led to their child drawing all over the walls, and can ask them why.


The Next AI Milestone: Bridging the Semantic Gap – Intuition Machine – Medium

#artificialintelligence

John Launchbury of DARPA has an excellent video that I recommend everyone watch (viewing just the slides will give a wrong impression of the content). It describes three waves of AI: Handcrafted Knowledge, where engineers encode domain expertise as explicit rules; Statistical Learning, where programmers create statistical models for specific problem domains and train them on big data; and Contextual Adaptation, where systems construct contextual explanatory models for classes of real-world phenomena. It is a somewhat simplified presentation because it lumps all of machine learning, Bayesian methods and Deep Learning into a single category, and there are many more approaches to AI that don't fit within DARPA's three waves.


Sparse Penalty in Deep Belief Networks: Using the Mixed Norm Constraint

arXiv.org Machine Learning

Deep Belief Networks (DBNs) have been successfully applied to popular machine learning tasks. Specifically, when applied to hand-written digit recognition, DBNs have achieved accuracy rates of approximately 98.8%. In an effort to optimize the data representation learned by the DBN and maximize its descriptive power, recent advances have focused on inducing sparse constraints at each layer of the DBN. In this paper we present a theoretical approach to sparse constraints in the DBN using the mixed norm for both non-overlapping and overlapping groups. We explore how these constraints affect classification accuracy for digit recognition on three different datasets (MNIST, USPS, RIMES) and provide initial estimates of their usefulness by varying parameters such as the group size and overlap percentage.
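
As a rough illustration of the kind of mixed-norm (l1/l2 group-sparsity) penalty the abstract refers to, here is a minimal sketch over one layer's hidden activations; the function name, group layout, and NumPy formulation are assumptions for illustration, not the paper's implementation:

```python
import numpy as np

def mixed_norm_penalty(hidden, group_size, overlap=0):
    """Illustrative l1/l2 mixed-norm penalty over groups of hidden units.

    hidden:     (batch, n_hidden) activations of one DBN layer
    group_size: number of hidden units per group
    overlap:    number of units shared between consecutive groups
    Returns the sum over groups of each group's l2 norm (an l1 norm
    across groups), which encourages whole groups of units to
    switch off together.
    """
    step = max(group_size - overlap, 1)
    n_hidden = hidden.shape[1]
    penalty = 0.0
    for start in range(0, n_hidden - group_size + 1, step):
        group = hidden[:, start:start + group_size]
        # l2 norm within each group, summed over the batch
        penalty += np.sqrt((group ** 2).sum(axis=1)).sum()
    return penalty

# Example: penalize a 100-unit layer with non-overlapping groups of 10,
# or with overlapping groups (overlap=5), and add the result to the loss.
h = np.random.rand(32, 100)
loss_sparse = mixed_norm_penalty(h, group_size=10)
loss_sparse_overlap = mixed_norm_penalty(h, group_size=10, overlap=5)
```

The l2 norm inside each group combined with the l1-style sum across groups pushes entire groups of units toward zero together, which is what produces group-structured sparsity; allowing groups to overlap changes which units are coupled.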


PROPS: Probabilistic personalization of black-box sequence models

arXiv.org Machine Learning

We present PROPS, a lightweight transfer learning mechanism for sequential data. PROPS learns probabilistic perturbations around the predictions of one or more arbitrarily complex, pre-trained black-box models (such as recurrent neural networks). The technique pins the black-box prediction functions to "source nodes" of a hidden Markov model (HMM), and uses the remaining nodes as "perturbation nodes" for learning customized perturbations around those predictions. In this paper, we describe the PROPS model, provide an algorithm for online learning of its parameters, and demonstrate the consistency of this estimation. We also explore the utility of PROPS in the context of personalized language modeling. In particular, we construct a baseline language model by training an LSTM on the entire Wikipedia corpus of 2.5 million articles (around 6.6 billion words), and then use PROPS to provide lightweight customization into a personalized language model of President Donald J. Trump's tweeting. We achieve good customization after only 2,000 additional words, and find that the PROPS model, being fully probabilistic, provides insight into when President Trump's speech departs from generic patterns in the Wikipedia corpus. Python code (for both the PROPS training algorithm and experiment reproducibility) is available at https://github.com/cylance/perturbed-sequence-model.
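
To make the source-node/perturbation-node construction concrete, here is a hedged toy sketch of an HMM whose first few states emit according to a frozen black-box predictor while the remaining states carry learnable emission distributions; class and parameter names are illustrative assumptions, and the authors' actual Python implementation is at the repository linked above:

```python
import numpy as np

class PropsSketch:
    """Toy sketch of the PROPS idea (not the released implementation):
    the first `n_source` HMM states emit according to a frozen black-box
    predictor, while the remaining `n_perturb` states have free,
    learnable emission distributions over the vocabulary.
    """

    def __init__(self, black_box_predict, n_source, n_perturb, vocab_size, seed=0):
        rng = np.random.default_rng(seed)
        self.black_box_predict = black_box_predict  # history -> prob. over vocab
        self.n_source = n_source
        self.n_states = n_source + n_perturb
        # Row-stochastic transition matrix and uniform initial belief.
        self.trans = rng.dirichlet(np.ones(self.n_states), size=self.n_states)
        self.init = np.full(self.n_states, 1.0 / self.n_states)
        # Learnable emissions only for the perturbation states.
        self.perturb_emit = rng.dirichlet(np.ones(vocab_size), size=n_perturb)

    def emission_probs(self, history, token):
        """P(token | state) for every state at the current step."""
        probs = np.empty(self.n_states)
        bb = self.black_box_predict(history)       # frozen source prediction
        probs[:self.n_source] = bb[token]          # source nodes are pinned
        probs[self.n_source:] = self.perturb_emit[:, token]
        return probs

    def predict_next(self, history, belief):
        """Mix per-state next-token distributions by the propagated belief."""
        per_state = np.vstack(
            [np.tile(self.black_box_predict(history), (self.n_source, 1)),
             self.perturb_emit]
        )
        belief = belief @ self.trans               # propagate belief one step
        return belief @ per_state                  # (vocab_size,) distribution

# Example with a trivial "black box" that predicts a uniform distribution.
vocab = 50
black_box = lambda history: np.full(vocab, 1.0 / vocab)
model = PropsSketch(black_box, n_source=1, n_perturb=3, vocab_size=vocab)
next_dist = model.predict_next(history=[4, 7, 2], belief=model.init.copy())
```

In this sketch the black-box model stays fixed and only the transition matrix and the perturbation states' emissions would be updated (e.g. by an online EM-style procedure), which is what keeps the personalization lightweight relative to retraining the underlying sequence model.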