Dhami, Devendra Singh
Causal Explanations of Structural Causal Models
Zečević, Matej, Dhami, Devendra Singh, Rothkopf, Constantin A., Kersting, Kristian
In explanatory interactive learning (XIL), the user queries the learner, the learner explains its answer to the user, and the loop repeats. XIL is attractive for two reasons: (1) the learner becomes better and (2) the user's trust increases. For both reasons to hold, the learner's explanations must be useful to the user and the user must be allowed to ask useful questions. Ideally, both questions and explanations should be grounded in a causal model, since such grounding avoids spurious fallacies. Ultimately, we seem to seek a causal variant of XIL. We believe the question part on the user's end to be solved, since the user's mental model can provide the causal model. But how would the learner provide causal explanations? In this work we show that existing explanation methods are not guaranteed to be causal even when provided with a Structural Causal Model (SCM). Specifically, we use the popular, proclaimed causal explanation method CXPlain to illustrate how the generated explanations leave open the question of truly causal explanations. Thus, as a step towards causal XIL, we propose a solution to the lack of causal explanations. We solve this problem by deriving from first principles an explanation method that makes full use of a given SCM, which we refer to as SC$\textbf{E}$ ($\textbf{E}$ standing for explanation). Since SCEs make use of structural information, any causal graph learner can now provide human-readable explanations. We conduct several experiments, including a user study with 22 participants, to investigate the virtue of SCE as causal explanations of SCMs.
On the Tractability of Neural Causal Inference
Zečević, Matej, Dhami, Devendra Singh, Kersting, Kristian
Roth (1996) proved that any form of marginal inference with probabilistic graphical models (e.g. Bayesian Networks) will at least be NP-hard. Introduced and extensively investigated in the past decade, the neural probabilistic circuits known as sum-product networks (SPNs) offer linear time complexity. On another note, research around neural causal models (NCMs) recently gained traction, demanding a tighter integration of causality for machine learning. To this end, we present a theoretical investigation of if, when, how and at what cost tractability occurs for different NCMs. We prove that SPN-based causal inference is generally tractable, as opposed to standard MLP-based NCMs. We further introduce a new tractable NCM class that is efficient in inference and fully expressive in terms of Pearl's Causal Hierarchy. Our comparative empirical illustration on simulations and standard benchmarks validates our theoretical proofs.
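To make the linear-time claim concrete, the following is a minimal sketch of marginal inference in a toy sum-product network. It is not taken from the paper: the network structure, the weights, and the class names `Leaf`, `Product` and `Sum` are illustrative assumptions. A variable is marginalized out by letting its leaves return 1, so any marginal is obtained with a single bottom-up pass over the network.

```python
# Toy SPN over two binary variables X0 and X1.
# All structure and parameters here are hypothetical, chosen for illustration only.

class Leaf:
    """Bernoulli leaf over one variable."""
    def __init__(self, var, p_true):
        self.var, self.p_true = var, p_true

    def value(self, evidence):
        x = evidence.get(self.var)  # None: variable is marginalized out
        if x is None:
            return 1.0
        return self.p_true if x == 1 else 1.0 - self.p_true


class Product:
    """Product node over children with disjoint variable scopes."""
    def __init__(self, children):
        self.children = children

    def value(self, evidence):
        out = 1.0
        for child in self.children:
            out *= child.value(evidence)
        return out


class Sum:
    """Sum node: a mixture of children with weights that sum to 1."""
    def __init__(self, weighted_children):
        self.weighted_children = weighted_children

    def value(self, evidence):
        return sum(w * child.value(evidence) for w, child in self.weighted_children)


spn = Sum([
    (0.3, Product([Leaf(0, 0.9), Leaf(1, 0.2)])),
    (0.7, Product([Leaf(0, 0.1), Leaf(1, 0.6)])),
])

joint = spn.value({0: 1, 1: 1})   # P(X0=1, X1=1)
marginal_x0 = spn.value({0: 1})   # P(X0=1), with X1 summed out
total = spn.value({})             # normalization constant
```

Because this toy network is a tree, each node is visited once per query. For a network with shared sub-circuits, caching node values would be needed to keep the pass linear in the number of edges.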
The Causal Loss: Driving Correlation to Imply Causation
Willig, Moritz, Zečević, Matej, Dhami, Devendra Singh, Kersting, Kristian
Most algorithms in classical and contemporary machine learning focus on correlation-based dependence between features to drive performance. Although success has been observed in many relevant problems, these algorithms fail when the underlying causality is inconsistent with the assumed relations. We propose a novel model-agnostic loss function called Causal Loss that improves the interventional quality of the prediction using an intervened neural-causal regularizer. In support of our theoretical results, our experimental illustration shows how causal loss bestows a non-causal associative model (like a standard neural net or decision tree) with interventional capabilities.
Neuro-Symbolic Forward Reasoning
Shindo, Hikaru, Dhami, Devendra Singh, Kersting, Kristian
Reasoning is an essential part of human intelligence and thus has been a long-standing goal in artificial intelligence research. With the recent success of deep learning, incorporating reasoning with deep learning systems, i.e., neuro-symbolic AI, has become a major field of interest. We propose the Neuro-Symbolic Forward Reasoner (NSFR), a new approach for reasoning tasks taking advantage of differentiable forward-chaining using first-order logic. The key idea is to combine differentiable forward-chaining reasoning with object-centric (deep) learning. Differentiable forward-chaining reasoning computes logical entailments smoothly, i.e., it deduces new facts from given facts and rules in a differentiable manner. The object-centric learning approach factorizes raw inputs into representations in terms of objects. Thus, it allows us to provide a consistent framework to perform the forward-chaining inference from raw inputs. NSFR factorizes the raw inputs into the object-centric representations, converts them into probabilistic ground atoms, and finally performs differentiable forward-chaining inference using weighted rules. Our comprehensive experimental evaluations on object-centric reasoning data sets, 2D Kandinsky patterns and 3D CLEVR-Hans, and a variety of tasks show the effectiveness and advantage of our approach.
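The forward-chaining step described above can be illustrated with a small sketch. It is a simplification under stated assumptions, not the NSFR implementation: the atoms, rules and weights are hypothetical, conjunction is modelled as a product, and disjunction is modelled with `max` as a stand-in for whatever smooth operator a full system would use.

```python
# Soft forward chaining over probabilistic ground atoms.
# Atoms, rules and weights below are hypothetical examples.

def forward_step(valuation, rules):
    """Apply every weighted ground rule once, reading from the old valuation."""
    new_valuation = dict(valuation)
    for head, body, weight in rules:
        body_value = 1.0
        for atom in body:
            body_value *= valuation.get(atom, 0.0)  # soft AND as a product
        # soft OR (here: max) between the existing and the newly derived value
        new_valuation[head] = max(new_valuation.get(head, 0.0), weight * body_value)
    return new_valuation


def forward_chain(valuation, rules, steps):
    """Run a fixed number of forward-chaining steps."""
    for _ in range(steps):
        valuation = forward_step(valuation, rules)
    return valuation


# Probabilistic ground atoms, e.g. as produced by an object-centric perception model.
facts = {"red(o1)": 0.9, "cube(o1)": 0.8, "large(o1)": 0.5}

# Weighted ground rules in the form (head, body atoms, rule weight).
rules = [
    ("red_cube(o1)", ["red(o1)", "cube(o1)"], 1.0),
    ("positive(img)", ["red_cube(o1)", "large(o1)"], 0.9),
]

result = forward_chain(facts, rules, steps=2)
```

Two steps are needed here because `positive(img)` depends on `red_cube(o1)`, which is only derived in the first step. Every operation is a product or a maximum, so the same computation could be written in an autodiff framework to obtain gradients with respect to the rule weights.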
SLASH: Embracing Probabilistic Circuits into Neural Answer Set Programming
Skryagin, Arseny, Stammer, Wolfgang, Ochs, Daniel, Dhami, Devendra Singh, Kersting, Kristian
The goal of combining the robustness of neural networks and the expressivity of symbolic methods has rekindled the interest in neuro-symbolic AI. Recent advancements in neuro-symbolic AI often consider specifically-tailored architectures consisting of disjoint neural and symbolic components, and thus do not exhibit desired gains that can be achieved by integrating them into a unifying framework. We introduce SLASH -- a novel deep probabilistic programming language (DPPL). At its core, SLASH consists of Neural-Probabilistic Predicates (NPPs) and logical programs which are united via answer set programming. The probability estimates resulting from NPPs act as the binding element between the logical program and raw input data, thereby allowing SLASH to answer task-dependent logical queries. This allows SLASH to elegantly integrate the symbolic and neural components in a unified framework. We evaluate SLASH on the benchmark data of MNIST addition as well as novel tasks for DPPLs such as missing data prediction and set prediction with state-of-the-art performance, thereby showing the effectiveness and generality of our method.
Relating Graph Neural Networks to Structural Causal Models
Zečević, Matej, Dhami, Devendra Singh, Veličković, Petar, Kersting, Kristian
Understanding causal interactions is central to human cognition and thereby of high value to science, engineering, business, and law (Penn and Povinelli 2007). Developmental psychology has shown how children explore similar to the manner of scientist, all by asking "What if?" and "Why?" type of questions (Gopnik 2012; Buchsbaum et al. 2012; Pearl and Mackenzie 2018), while artificial intelligence research dreams of automating the scientist's manner (McCarthy 1998; McCarthy and Hayes 1981; Steinruecken et al. 2019). Deep learning has brought optimizable universality in approximation which refers to the fact that for any function there will exist a neural network that is close in approximation to arbitrary precision (Cybenko 1989; Hornik
The SCM implies a graph structure over its modelled variables, and since GNN work on graphs, a closer inspection on the relation between the two models seems reasonable towards progressing research in neural-causal AI. Instead of taking inspiration from causality's principles for improving machine learning (Mitrovic et al. 2020), we instead show how GNN can be used to perform causal computations i.e., how causality can emerge within neural models. To be more precise on the term causal inference: we refer to the modelling of Pearl's Causal Hierarchy (PCH) (Bareinboim et al. 2020). That is, we are given partial knowledge on the SCM in the form of e.g. the (partial) causal graph and/or data from the different levels of the hierarchy.
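As a small illustration of the statement that an SCM implies a graph structure over its variables, the sketch below defines a toy SCM and reads off its directed graph, both observationally and under an intervention. The three variables, their structural equations, and the helper names `implied_graph` and `evaluate` are assumptions made for this example and do not come from the paper.

```python
# Hypothetical three-variable SCM. Each entry maps a variable to
# (its parents, its structural equation taking parent values, then the noise term).
# Entries are listed in a topological order of the implied graph.
scm = {
    "Z": ((), lambda u: u),
    "X": (("Z",), lambda z, u: 2 * z + u),
    "Y": (("X", "Z"), lambda x, z, u: x - z + u),
}


def implied_graph(scm, do=None):
    """Directed edges (parent, child); edges into intervened variables are cut."""
    do = do or {}
    return {
        (parent, var)
        for var, (parents, _) in scm.items()
        if var not in do
        for parent in parents
    }


def evaluate(scm, noise, do=None):
    """Compute all variables from the noise terms, optionally under do(...)."""
    do = do or {}
    values = {}
    for var, (parents, equation) in scm.items():
        if var in do:
            values[var] = do[var]  # the intervention replaces the equation
        else:
            values[var] = equation(*(values[p] for p in parents), noise[var])
    return values


noise = {"Z": 1.0, "X": 0.0, "Y": 0.0}
observational = evaluate(scm, noise)
interventional = evaluate(scm, noise, do={"X": 5.0})
graph = implied_graph(scm)
mutilated_graph = implied_graph(scm, do={"X": 5.0})
```

The edge set returned by `implied_graph` is the kind of structure a graph neural network could take as input. Under `do(X = 5)` the edge from `Z` to `X` disappears, which is the graph-level counterpart of replacing the structural equation of `X` with a constant.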
Non-Parametric Learning of Gaifman Models
Dhami, Devendra Singh, Yen, Siwen, Kunapuli, Gautam, Natarajan, Sriraam
We consider the problem of structure learning for Gaifman models and learn relational features that can be used to derive feature representations from a knowledge base. These relational features are first-order rules that are then partially grounded and counted over local neighborhoods of a Gaifman model to obtain the feature representations. We propose a method for learning these relational features for a Gaifman model by using relational tree distances. Our empirical evaluation on real data sets demonstrates the superiority of our approach over classical rule-learning.
Knowledge-augmented Column Networks: Guiding Deep Learning with Advice
Das, Mayukh, Dhami, Devendra Singh, Yu, Yang, Kunapuli, Gautam, Natarajan, Sriraam
Recently, deep models have had considerable success in several tasks, especially with low-level representations. However, effective learning from sparse, noisy samples is a major challenge in most deep models, especially in domains with structured representations. Inspired by the proven success of human-guided machine learning, we propose Knowledge-augmented Column Networks, a relational deep learning framework that leverages human advice/knowledge to learn better models in the presence of sparsity and systematic noise.
Human-Guided Learning of Column Networks: Augmenting Deep Learning with Advice
Das, Mayukh, Yu, Yang, Dhami, Devendra Singh, Kunapuli, Gautam, Natarajan, Sriraam
Recently, deep models have been successfully applied in several applications, especially with low-level representations. However, sparse, noisy samples and structured domains (with multiple objects and interactions) are some of the open challenges in most deep models. Column Networks, a deep architecture, can succinctly capture such domain structure and interactions, but may still be prone to sub-optimal learning from sparse and noisy samples. Inspired by the success of human-advice guided learning in AI, especially in data-scarce domains, we propose Knowledge-augmented Column Networks that leverage human advice/knowledge for better learning with noisy/sparse samples. Our experiments demonstrate that our approach leads to either superior overall performance or faster convergence (i.e., both effective and efficient).
Knowledge-Based Morphological Classification of Galaxies from Vision Features
Dhami, Devendra Singh (Indiana University Bloomington) | Leake, David (Indiana University Bloomington) | Natarajan, Sriraam (Indiana University Bloomington)
This paper presents a knowledge-based approach to the task of learning and identifying galaxies from their images. To this effect, we propose a crowd-sourced pipeline approach that employs two systems: a case-based system and a rule-based system. First, the approach uses computer vision techniques to extract morphological features, i.e., features describing the structure of the galaxy, such as its shape and central characteristics (e.g., whether it has a bar or bulge at its center). Then it employs a case-based reasoning system and a rule-based system to perform the classification task. Our initial results show that this pipeline is effective in learning reasonably accurate models on this complex task.