Goto

Collaborating Authors

 Druce, Jeff


Brittle AI, Causal Confusion, and Bad Mental Models: Challenges and Successes in the XAI Program

arXiv.org Artificial Intelligence

The advances in artificial intelligence enabled by deep learning architectures are undeniable. In several cases, deep neural network driven models have surpassed human level performance in benchmark autonomy tasks. The underlying policies for these agents, however, are not easily interpretable. In fact, given their underlying deep models, it is impossible to directly understand the mapping from observations to actions for any reasonably complex agent. Producing this supporting technology to "open the black box" of these AI systems, while not sacrificing performance, was the fundamental goal of the DARPA XAI program. In our journey through this program, we have several "big picture" takeaways: 1) Explanations need to be highly tailored to their scenario; 2) many seemingly high performing RL agents are extremely brittle and are not amendable to explanation; 3) causal models allow for rich explanations, but how to present them isn't always straightforward; and 4) human subjects conjure fantastically wrong mental models for AIs, and these models are often hard to break. This paper discusses the origins of these takeaways, provides amplifying information, and suggestions for future work.


Explainable Artificial Intelligence (XAI) for Increasing User Trust in Deep Reinforcement Learning Driven Autonomous Systems

arXiv.org Artificial Intelligence

We consider the problem of providing users of deep Reinforcement Learning (RL) based systems with a better understanding of when their output can be trusted. We offer an explainable artificial intelligence (XAI) framework that provides a three-fold explanation: a graphical depiction of the systems generalization and performance in the current game state, how well the agent would play in semantically similar environments, and a narrative explanation of what the graphical information implies. We created a user-interface for our XAI framework and evaluated its efficacy via a human-user experiment. The results demonstrate a statistically significant increase in user trust and acceptance of the AI system with explanation, versus the AI system without explanation.


Causal Learning and Explanation of Deep Neural Networks via Autoencoded Activations

arXiv.org Machine Learning

Deep neural networks are complex and opaque. As they enter application in a variety of important and safety critical domains, users seek methods to explain their output predictions. We develop an approach to explaining deep neural networks by constructing causal models on salient concepts contained in a CNN. We develop methods to extract salient concepts throughout a target network by using autoencoders trained to extract human-understandable representations of network activations. We then build a bayesian causal model using these extracted concepts as variables in order to explain image classification. Finally, we use this causal model to identify and visualize features with significant causal influence on final classification.