Companies are increasingly using machine learning models to make decisions – such as allocating jobs, loans, or university admissions – that directly or indirectly affect people's lives. Algorithms are also used to recommend a movie to watch, a person to date, or an apartment to rent. When talking to business customers – the operators of machine learning (ML) systems – I hear a growing demand to understand how these models and algorithms work, especially as the number of ML use cases without a human in the loop expands. Imagine an ML model recommending the top 10 candidates from 100 applicants for a job posting. Before trusting the model's recommendation, the recruiter wants to check the results.
There has recently been a surge of work in explainable artificial intelligence (XAI). This research area tackles the important problem that complex machine learning models and algorithms often cannot provide insight into their behavior and decision processes. XAI systems expose parts of their internal workings to users, providing explanations of their decisions at some level of detail. These explanations are important to ensure algorithmic fairness, to identify potential bias or problems in the training data, and to verify that the algorithms perform as expected. However, the explanations produced by these systems are neither standardized nor systematically assessed. In an effort to create best practices and identify open challenges, we provide our definition of explainability and show how it can be used to classify the existing literature. We discuss why current approaches to explanation, especially for deep neural networks, are insufficient. Finally, based on our survey, we conclude with suggested future research directions for explainable artificial intelligence.
Nowadays, deep neural networks are widely used in mission-critical systems such as healthcare, self-driving vehicles, and military applications, which have a direct impact on human lives. However, the black-box nature of deep neural networks challenges their use in these applications, raising ethical and legal concerns and undermining trust. Explainable Artificial Intelligence (XAI) is a field of Artificial Intelligence (AI) that promotes a set of tools, techniques, and algorithms that can generate high-quality, interpretable, intuitive, human-understandable explanations of AI decisions. In addition to providing a holistic view of the current XAI landscape in deep learning, this paper provides mathematical summaries of seminal work. We start by proposing a taxonomy that categorizes XAI techniques by the scope of their explanations, the methodology behind the algorithms, and their explanation level or usage, which helps build trustworthy, interpretable, and self-explanatory deep learning models. We then describe the main principles used in XAI research and present a historical timeline of landmark XAI studies from 2007 to 2020. After explaining each category of algorithms and approaches in detail, we evaluate the explanation maps generated by eight XAI algorithms on image data, discuss the limitations of this approach, and provide potential future directions for improving XAI evaluation.
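Many of the explanation maps referred to above are gradient-based: they rank input features by how strongly the model's output responds to small perturbations of each feature. As a hedged illustration only – the toy linear "model", the weights, and the finite-difference shortcut below are assumptions for demonstration, not any specific algorithm from the survey (real methods compute exact gradients via backpropagation over images) – a minimal sketch:

```python
import random

# Toy "model": a fixed linear scorer standing in for a trained network.
# The weights are illustrative, not taken from any real system.
random.seed(0)
W = [random.gauss(0, 1) for _ in range(10)]

def model(x):
    """Scalar score for a 10-dimensional input."""
    return sum(w * xi for w, xi in zip(W, x))

def saliency(f, x, eps=1e-4):
    """Approximate |df/dx_i| for each feature via central differences.

    Gradient-based explanation maps use this quantity to highlight
    which input features most influence the model's output.
    """
    sal = []
    for i in range(len(x)):
        xp = list(x); xp[i] += eps   # nudge feature i up
        xm = list(x); xm[i] -= eps   # nudge feature i down
        sal.append(abs(f(xp) - f(xm)) / (2 * eps))
    return sal

x = [random.gauss(0, 1) for _ in range(10)]
# For a linear model, the saliency of feature i is exactly |W[i]|.
print(all(abs(s - abs(w)) < 1e-6 for s, w in zip(saliency(model, x), W)))
```

For a deep network the same idea applies per pixel, producing a heatmap over the image; the evaluation question the paper raises is whether such maps faithfully reflect the model's reasoning.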
In 2017, a Palestinian construction worker in the West Bank settlement of Beitar Illit, near Jerusalem, posted a picture of himself on Facebook in which he was leaning against a bulldozer. Shortly after, Israeli police arrested him on suspicion that he was planning an attack, because the caption of his post read "attack them." The real caption of the post was "good morning" in Arabic. But for some unknown reason, Facebook's artificial-intelligence-powered translation service rendered the text as "hurt them" in English and "attack them" in Hebrew. The Israel Defense Forces use Facebook's automated translation to monitor the accounts of Palestinian users for possible threats. In this case, they trusted Facebook's AI enough not to have the post checked by an Arabic-speaking officer before making the arrest. The Palestinian worker was eventually released after the mistake came to light – but not before he underwent hours of questioning.
The irony is not lost on Kate Saenko. Now that humans have programmed computers to learn, they want to know exactly what the computers have learned and how they make decisions once their learning process is complete. To find out, Saenko, a Boston University College of Arts & Sciences associate professor of computer science, turned to humans – asking them to look at dozens of pictures depicting steps the computer may have taken on its road to a decision and to identify its most likely path. The humans gave Saenko answers that made sense, but there was a problem: they made sense to humans, and humans, Saenko knew, have biases. In fact, humans don't even understand how they themselves make decisions.