 Darwiche, Adnan


On Symbolically Encoding the Behavior of Random Forests

arXiv.org Artificial Intelligence

Recent work has shown that the input-output behavior of some machine learning systems can be captured symbolically using Boolean expressions or tractable Boolean circuits, which facilitates reasoning about the behavior of these systems. While most of the focus has been on systems with Boolean inputs and outputs, we address systems with discrete inputs and outputs, including ones with discretized continuous variables, as in systems based on decision trees. We also focus on the suitability of encodings for computing prime implicants, which have recently played a central role in explaining the decisions of machine learning systems. We show some key distinctions from encodings for satisfiability, and propose an encoding that is sound and complete for the given task.
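
As a rough illustration of the central notion (and not the encoding proposed in the paper), the sketch below computes the prime implicants, i.e. the minimal "sufficient reasons," of a toy classifier with discrete inputs by brute-force enumeration. The classifier, variable names, and domains are all hypothetical.

```python
# Brute-force prime implicants of a decision function with discrete inputs.
from itertools import combinations, product

def is_implicant(classify, instance, domains, subset, label):
    """A set of the instance's assignments is an implicant if every completion
    of the remaining variables yields the same label."""
    free = [v for v in domains if v not in subset]
    for values in product(*(domains[v] for v in free)):
        candidate = dict(zip(free, values))
        candidate.update({v: instance[v] for v in subset})
        if classify(candidate) != label:
            return False
    return True

def prime_implicants(classify, instance, domains):
    """Subset-minimal sets of the instance's assignments that fix its label."""
    label = classify(instance)
    primes = []
    for size in range(len(domains) + 1):
        for subset in combinations(domains, size):
            if any(set(p) <= set(subset) for p in primes):
                continue  # a smaller implicant already covers this candidate
            if is_implicant(classify, instance, domains, subset, label):
                primes.append(subset)
    return label, primes

# Hypothetical discrete classifier: approve iff income is high, or income is
# medium and debt is low.
domains = {"income": ["low", "medium", "high"], "debt": ["low", "high"]}
classify = lambda x: x["income"] == "high" or (
    x["income"] == "medium" and x["debt"] == "low")
print(prime_implicants(classify, {"income": "high", "debt": "high"}, domains))
```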


A New Perspective on Learning Context-Specific Independence

arXiv.org Artificial Intelligence

Local structure such as context-specific independence (CSI) has received much attention in the probabilistic graphical model (PGM) literature, as it facilitates both the modeling of large, complex systems and reasoning with them. In this paper, we provide a new perspective on how to learn CSIs from data. We propose to first learn a functional, parameterized representation of a conditional probability table (CPT), such as a neural network. Next, we quantize this continuous function into an arithmetic circuit representation that facilitates efficient inference. In the first step, we can leverage the many powerful tools that have been developed in the machine learning literature. In the second step, we exploit more recently developed analytic tools from explainable AI for the purposes of learning CSIs. Finally, we contrast our approach, empirically and conceptually, with more traditional variable-splitting approaches that search for CSIs more explicitly.
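
The sketch below illustrates only the quantization idea on a hand-specified CPT (the neural-network fitting step is omitted, and the CPT, variable names, and quantization levels are made up): once parameters are snapped to a small set of values, a parent whose value no longer affects the quantized parameter in some context is exposed as a context-specific independence.

```python
from itertools import product

def quantize(p, levels):
    """Snap a probability to the nearest value in a small set of levels."""
    return min(levels, key=lambda q: abs(q - p))

def csi_report(cpt, parents, domains, levels):
    """List contexts in which a parent becomes irrelevant after quantization."""
    q = {ctx: quantize(p, levels) for ctx, p in cpt.items()}
    report = []
    for i, parent in enumerate(parents):
        others = [j for j in range(len(parents)) if j != i]
        for other_vals in product(*(domains[parents[j]] for j in others)):
            quantized = set()
            for v in domains[parent]:
                full = list(other_vals)
                full.insert(i, v)          # rebuild the full parent instantiation
                quantized.add(q[tuple(full)])
            if len(quantized) == 1:        # the parent's value no longer matters here
                context = dict(zip([parents[j] for j in others], other_vals))
                report.append((parent, context))
    return report

# Hypothetical CPT for P(alarm = 1 | burglary, earthquake): once burglary = 1,
# the earthquake value barely matters, and quantization turns that into an
# exact context-specific independence.
parents = ["burglary", "earthquake"]
domains = {"burglary": [0, 1], "earthquake": [0, 1]}
cpt = {(0, 0): 0.01, (0, 1): 0.29, (1, 0): 0.94, (1, 1): 0.96}
print(csi_report(cpt, parents, domains, levels=[0.0, 0.3, 0.95]))
# -> [('earthquake', {'burglary': 1})]
```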


Decomposing Parameter Estimation Problems

Neural Information Processing Systems

We propose a technique for decomposing the parameter learning problem in Bayesian networks into independent learning problems. Our technique applies to incomplete datasets and exploits variables that are either hidden or observed in the given dataset. We show empirically that the proposed technique can lead to orders-of-magnitude savings in learning time. We explain, analytically and empirically, the reasons behind our reported savings, and compare the proposed technique to related ones that are sometimes used by inference algorithms.
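
The sketch below gives a rough flavor of how such a decomposition can arise, without claiming to reproduce the paper's exact conditions: removing variables that are observed in every record from the moral graph often splits the hidden variables into components whose parameters can be estimated independently. The network and observation pattern are hypothetical.

```python
from collections import defaultdict

def moral_neighbors(dag):
    """Undirected moral graph: connect each node to its parents, children,
    and co-parents (parents that share a child)."""
    nbrs = defaultdict(set)
    for child, parents in dag.items():
        for p in parents:
            nbrs[child].add(p)
            nbrs[p].add(child)
        for a in parents:
            for b in parents:
                if a != b:
                    nbrs[a].add(b)
    return nbrs

def components_after_conditioning(dag, observed):
    """Connected components of the moral graph once fully observed variables
    are removed; each induces a separate estimation problem."""
    nbrs = moral_neighbors(dag)
    seen, comps = set(), []
    for v in (v for v in dag if v not in observed):
        if v in seen:
            continue
        comp, stack = set(), [v]
        while stack:
            u = stack.pop()
            if u in comp:
                continue
            comp.add(u)
            seen.add(u)
            stack.extend(w for w in nbrs[u] if w not in observed and w not in comp)
        comps.append(comp)
    return comps

# Hypothetical chain A -> B -> C -> D with B observed in every record:
dag = {"A": [], "B": ["A"], "C": ["B"], "D": ["C"]}
print(components_after_conditioning(dag, observed={"B"}))  # [{'A'}, {'C', 'D'}]
```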


On the Relative Expressiveness of Bayesian and Neural Networks

arXiv.org Artificial Intelligence

Shortly after the field was born in the 1950s, the focus turned to symbolic, model-based approaches, which were premised on the need to represent and reason with domain knowledge, and exemplified by the use of logic to represent such knowledge (McCarthy, 1959). In the 1980s, the focus turned to probabilistic, model-based approaches, as exemplified by Bayesian networks and probabilistic graphical models more generally (first major milestone) (Pearl, 1988). Starting in the 1990s, and as data became abundant, these probabilistic models provided the foundation for much of the research in machine learning, where models were learned either generatively or discriminatively from data. Recently, the field shifted its focus to numeric, function-based approaches, as exemplified by neural networks, which are trained discriminatively using labeled data (deep learning, second major milestone) (Goodfellow et al., 2016; Hinton et al., 2006; Bengio et al., 2006; Ranzato et al., 2006; Rosenblatt, 1958; McCulloch & Pitts, 1943). Perhaps the biggest surprise with the second milestone is the extent to which certain tasks, associated with perception or limited forms of cognition, can be approximated using functions (i.e., neural networks) that are learned purely from labeled data, without the need for modeling or reasoning (Darwiche, 2018).


An Exhaustive DPLL Algorithm for Model Counting

Journal of Artificial Intelligence Research

State-of-the-art model counters are based on exhaustive DPLL algorithms, and have been successfully used in probabilistic reasoning, one of the key problems in AI. In this article, we present a new exhaustive DPLL algorithm with a formal semantics, a proof of correctness, and a modular design. The modular design is based on the separation of the core model counting algorithm from SAT solving techniques. We also show that the trace of our algorithm belongs to the language of Sentential Decision Diagrams (SDDs), which is a subset of Decision-DNNFs, the trace of existing state-of-the-art model counters. Still, our experimental analysis shows results comparable to those of state-of-the-art model counters. Furthermore, we obtain the first top-down SDD compiler, and show orders-of-magnitude improvements in SDD construction time over the existing bottom-up SDD compiler.
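
For intuition, the sketch below is a minimal exhaustive-DPLL-style model counter: it conditions on a variable, decomposes the residual CNF into variable-disjoint components, and caches counts of subformulas. It omits the SAT-solving techniques, the formal trace, and the SDD construction that the article is actually about; the example formula is made up.

```python
from functools import lru_cache

def variables(cnf):
    return {abs(l) for c in cnf for l in c}

def condition(cnf, literal):
    """Simplify the CNF given that `literal` is true (None if a clause dies)."""
    out = []
    for clause in cnf:
        if literal in clause:
            continue
        reduced = clause - {-literal}
        if not reduced:
            return None
        out.append(reduced)
    return frozenset(out)

def components(cnf):
    """Split clauses into groups that share no variables."""
    cnf, comps = list(cnf), []
    while cnf:
        group = [cnf.pop()]
        vars_ = {abs(l) for l in group[0]}
        changed = True
        while changed:
            changed = False
            for clause in cnf[:]:
                if {abs(l) for l in clause} & vars_:
                    cnf.remove(clause)
                    group.append(clause)
                    vars_ |= {abs(l) for l in clause}
                    changed = True
        comps.append(frozenset(group))
    return comps

@lru_cache(maxsize=None)
def count(cnf, n_vars):
    """Number of models of `cnf` over `n_vars` free variables."""
    if cnf is None:
        return 0
    if not cnf:
        return 2 ** n_vars
    comps = components(cnf)
    if len(comps) > 1:
        total, used = 1, 0
        for comp in comps:
            total *= count(comp, len(variables(comp)))
            used += len(variables(comp))
        return total * 2 ** (n_vars - used)
    v = next(iter(variables(cnf)))
    return count(condition(cnf, v), n_vars - 1) + count(condition(cnf, -v), n_vars - 1)

# (A or B) and (not A or C) over variables {1, 2, 3}:
cnf = frozenset({frozenset({1, 2}), frozenset({-1, 3})})
print(count(cnf, 3))  # 4 models
```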


A Symbolic Approach to Explaining Bayesian Network Classifiers

arXiv.org Artificial Intelligence

We propose an approach for explaining Bayesian network classifiers, which is based on compiling such classifiers into decision functions that have a tractable and symbolic form. We introduce two types of explanations for why a classifier may have classified an instance positively or negatively and suggest algorithms for computing these explanations. The first type of explanation identifies a minimal set of the currently active features that is responsible for the current classification, while the second type of explanation identifies a minimal set of features whose current state (active or not) is sufficient for the classification. We consider in particular the compilation of Naive and Latent-Tree Bayesian network classifiers into Ordered Decision Diagrams (ODDs), providing a context for evaluating our proposal using case studies and experiments based on classifiers from the literature.
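
As a hedged illustration of the compilation target (not the paper's algorithm), the sketch below builds a reduced Ordered Decision Diagram for a toy decision function by Shannon expansion over a fixed feature order, sharing identical sub-diagrams through a unique table. The Naive Bayes-style weights and threshold are hypothetical.

```python
# Hypothetical decision function: classify positively when the sum of
# per-feature log-odds contributions reaches a threshold.
WEIGHTS = [2.0, -1.5, 1.0, 0.5]    # contribution of setting feature i to 1
THRESHOLD = 1.2

def classify(assignment):
    return sum(w for w, x in zip(WEIGHTS, assignment) if x) >= THRESHOLD

unique = {}                        # unique table: (feature, low, high) -> node id
nodes = {0: "FALSE", 1: "TRUE"}

def make_node(feature, low, high):
    """Reduce: skip redundant tests and share structurally identical nodes."""
    if low == high:
        return low
    key = (feature, low, high)
    if key not in unique:
        node_id = len(nodes)
        nodes[node_id] = key
        unique[key] = node_id
    return unique[key]

def compile_odd(prefix=()):
    """Shannon expansion over features in a fixed order (exhaustive, so only
    suitable for the handful of features in this toy example)."""
    if len(prefix) == len(WEIGHTS):
        return 1 if classify(prefix) else 0
    low = compile_odd(prefix + (0,))
    high = compile_odd(prefix + (1,))
    return make_node(len(prefix), low, high)

root = compile_odd()
print("root node:", root)
print("diagram nodes:", nodes)
```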


Conditional PSDDs: Modeling and Learning With Modular Knowledge

AAAI Conferences

Probabilistic Sentential Decision Diagrams (PSDDs) have been proposed for learning tractable probability distributions from a combination of data and background knowledge (in the form of Boolean constraints). In this paper, we propose a variant on PSDDs, called conditional PSDDs, for representing a family of distributions that are conditioned on the same set of variables. Conditional PSDDs can also be learned from a combination of data and (modular) background knowledge. We use conditional PSDDs to define a more structured version of Bayesian networks, in which nodes can have an exponential number of states, hence expanding the scope of domains where Bayesian networks can be applied. Compared to classical PSDDs, the new representation exploits the independencies captured by a Bayesian network to decompose the learning process into localized learning tasks, which enables the learning of better models while using less computation. We illustrate the promise of conditional PSDDs and structured Bayesian networks empirically, and by providing a case study in modeling distributions over routes on a map.
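
As a very small illustration of the "localized learning tasks" mentioned above (and nothing of the PSDD machinery itself), the sketch below estimates a conditional model P(child | parent) by splitting the data on the parent's state and fitting each conditional distribution independently; the variable names and records are made up.

```python
from collections import Counter, defaultdict

def learn_conditionals(records, parent, child):
    """Estimate P(child | parent) by splitting the data on the parent state."""
    slices = defaultdict(list)
    for row in records:
        slices[row[parent]].append(row[child])
    conditionals = {}
    for parent_state, child_values in slices.items():
        counts = Counter(child_values)
        total = sum(counts.values())
        conditionals[parent_state] = {v: c / total for v, c in counts.items()}
    return conditionals

# Hypothetical data: 'region' plays the role of the structured parent and
# 'route' the conditioned variable.
records = [
    {"region": "north", "route": "r1"},
    {"region": "north", "route": "r1"},
    {"region": "north", "route": "r2"},
    {"region": "south", "route": "r3"},
    {"region": "south", "route": "r3"},
]
print(learn_conditionals(records, parent="region", child="route"))
```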


Tractability in Structured Probability Spaces

Neural Information Processing Systems

Recently, the Probabilistic Sentential Decision Diagram (PSDD) has been proposed as a framework for systematically inducing and learning distributions over structured objects, including combinatorial objects such as permutations and rankings, paths and matchings on a graph, etc. In this paper, we study the scalability of such models in the context of representing and learning distributions over routes on a map. In particular, we introduce the notion of hierarchical route distributions and show how they can be leveraged to construct tractable PSDDs over route distributions, allowing them to scale to larger maps. We illustrate the utility of our model empirically, in a route prediction task, showing how accuracy can be increased significantly compared to Markov models.
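
The sketch below conveys the hierarchical decomposition informally (it is not the paper's PSDD construction): a route's probability is scored as the probability of its region-level path times the probability of each local sub-route given the pair of regions it connects. All tables, region names, and sub-route labels are hypothetical.

```python
# Region-level path distribution, e.g. P(region path A -> B -> C).
P_REGION_PATH = {
    ("A", "B", "C"): 0.6,
    ("A", "C"): 0.4,
}

# Local sub-route distributions, conditioned on the pair of regions crossed.
P_LOCAL = {
    ("A", "B"): {"a1-b2": 0.7, "a2-b1": 0.3},
    ("B", "C"): {"b2-c1": 1.0},
    ("A", "C"): {"a2-c3": 1.0},
}

def route_probability(region_path, local_routes):
    """Hierarchical score: the region-level choice times each local choice."""
    p = P_REGION_PATH.get(tuple(region_path), 0.0)
    region_pairs = zip(region_path, region_path[1:])
    for (src, dst), local in zip(region_pairs, local_routes):
        p *= P_LOCAL.get((src, dst), {}).get(local, 0.0)
    return p

print(route_probability(["A", "B", "C"], ["a1-b2", "b2-c1"]))  # 0.6 * 0.7 * 1.0
```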


Human-Level Intelligence or Animal-Like Abilities?

arXiv.org Machine Learning

The vision systems of the eagle and the snake outperform everything that we can make in the laboratory, but snakes and eagles cannot build an eyeglass or a telescope or a microscope. (Judea Pearl)


Learning Bayesian networks with ancestral constraints

Neural Information Processing Systems

We consider the problem of learning Bayesian networks optimally, when subject to background knowledge in the form of ancestral constraints. Our approach builds on a recently proposed framework for optimal structure learning with non-decomposable scores, which is general enough to accommodate ancestral constraints. The proposed framework exploits oracles for learning structures using decomposable scores, which cannot accommodate ancestral constraints since they are non-decomposable. We show how to empower these oracles by passing them decomposable constraints that they can handle, inferred from the ancestral constraints that they cannot. Empirically, we demonstrate that our approach can be orders-of-magnitude more efficient than alternative frameworks, such as those based on integer linear programming.
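
As a hedged sketch of the kind of inference alluded to above (not the paper's full procedure), the code below projects non-decomposable ancestral constraints onto decomposable edge constraints that a standard structure-learning oracle can enforce: it closes the positive ancestral constraints transitively and then derives forbidden directed edges. The variables and rules are illustrative.

```python
from itertools import product

def implied_edge_constraints(must_ancestor, must_not_ancestor):
    """Derive forbidden directed edges from ancestral constraints.

    must_ancestor:     pairs (x, y) meaning x must be an ancestor of y
    must_not_ancestor: pairs (x, y) meaning x must NOT be an ancestor of y
    """
    # Transitive closure of positive constraints: x ~> y and y ~> z gives x ~> z.
    closure = set(must_ancestor)
    changed = True
    while changed:
        changed = False
        for (a, b), (c, d) in product(closure, repeat=2):
            if b == c and (a, d) not in closure:
                closure.add((a, d))
                changed = True
    forbidden = set()
    for x, y in closure:
        forbidden.add((y, x))       # edge y -> x would create a directed cycle
    for x, y in must_not_ancestor:
        forbidden.add((x, y))       # edge x -> y would make x an ancestor of y
    return forbidden

# Toy example: A must be an ancestor of B, and B of C; C must not reach A.
print(implied_edge_constraints({("A", "B"), ("B", "C")}, {("C", "A")}))
```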