Goto

Collaborating Authors

 bayesian structure learning


Bayesian Structure Learning by Recursive Bootstrap

Neural Information Processing Systems

We address the problem of Bayesian structure learning for domains with hundreds of variables by employing non-parametric bootstrap, recursively. We propose a method that covers both model averaging and model selection in the same framework. The proposed method deals with the main weakness of constraint-based learning---sensitivity to errors in the independence tests---by a novel way of combining bootstrap with constraint-based learning. Essentially, we provide an algorithm for learning a tree, in which each node represents a scored CPDAG for a subset of variables and the level of the node corresponds to the maximal order of conditional independencies that are encoded in the graph. As higher order independencies are tested in deeper recursive calls, they benefit from more bootstrap samples, and therefore are more resistant to the curse-of-dimensionality. Moreover, the re-use of stable low order independencies allows greater computational efficiency. We also provide an algorithm for sampling CPDAGs efficiently from their posterior given the learned tree. That is, not from the full posterior, but from a reduced space of CPDAGs encoded in the learned tree. We empirically demonstrate that the proposed algorithm scales well to hundreds of variables, and learns better MAP models and more reliable causal relationships between variables, than other state-of-the-art-methods.


Reviews: Modeling Uncertainty by Learning a Hierarchy of Deep Neural Connections

Neural Information Processing Systems

This paper proposes BRAINet as to combine Bayesian structure learning and Bayesian neural networks. In detail, the method assumes a confounder on the input features X and the discriminative network parameters \phi, where this confounder is defined as the generative graph structure on X, and the discriminative network shares the same structure as the generative one. Given observations X and Y, the approach first sample the generative graph structure from the posterior given X, then train the parameters of the corresponding discriminative network in order to fit the posterior distribution of phi given X and Y. Experiments are performed on calibration and OOD tasks, with MC-dropout and deep Ensembles as the main comparing baselines. Reviewers include experts in Bayesian structure learning and Bayesian neural networks. They read the author feedback carefully and engaged in post-rebuttal discussion actively.


Reviews: Bayesian Structure Learning by Recursive Bootstrap

Neural Information Processing Systems

This work expands on the algorithm RAI by Yehezkel and Lerner for constraint-based structure learning of Bayesian networks. Each node of the tree splits the variables into subsets (one descendant and K ancestral subsets) by using conditional independence (CI) tests of order n. The submission proposes the B-RAI algorithm that leverages bootstrap to allow the algorithm to output a set of highly likely CPDAG rather than the MAP one. Bootstrap is not naively leveraged. Instead, it is integrated in the recursive call of the algorithm.


Estimation of Distribution Algorithms with Matrix Transpose in Bayesian Learning

arXiv.org Machine Learning

Estimation of distribution algorithms (EDAs) constitute a new branch of evolutionary optimization algorithms, providing effective and efficient optimization performance in a variety of research areas. Recent studies have proposed new EDAs that employ mutation operators in standard EDAs to increase the population diversity. We present a new mutation operator, a matrix transpose, specifically designed for Bayesian structure learning, and we evaluate its performance in Bayesian structure learning. The results indicate that EDAs with transpose mutation give markedly better performance than conventional EDAs. Introduction Estimation of distribution algorithms (EDAs) constitute a new branch of evolutionary optimization algorithms [1]; their workflow is similar to that of conventional GAs.


A Step-by-Step Guide in detecting causal relationships using Bayesian Structure Learning in Python.

#artificialintelligence

The use of machine learning techniques has become a standard toolkit to obtain useful insights and make predictions in many areas such as disease prediction, recommendation systems, natural language processing. Although good performances can be achieved, it is not straightforward to extract causal relationships with, for example, the target variable. In other words: which variables have a direct causal effect on the target variable? Such insights are important to determine the driving factors that reach the conclusion, and as such, strategic actions can be taken. A branch of machine learning is Bayesian probabilistic graphical models, also named Bayesian networks (BN), which can be used to determine such causal factors. Let's rehash some terminology before we jump into the technical details of causal models.


Bayesian structure learning and sampling of Bayesian networks with the R package BiDAG

arXiv.org Machine Learning

A Bayesian network is a probabilistic graphical model, which represents conditional independence relationships between a set of random variables by a directed acyclic graph (DAG).The problem of DAG learning from observational data is hard (Chickering 1996), and the number of DAGs grows super-exponentially with the number of nodes. Hence, developing and implementing methods to learn an underlying DAG from observational data in reasonable time continues to be the focus of much research (Bartlett and Cussens 2017; Goudie and Mukherjee 2016; Scanagatta, de Campos, and Corani 2015). Drton and Maathuis (2017) provide an overview of the approaches for structure learning of graphical models including Bayesian networks. The R (R Development Core Team 2008) packages pcalg (Kalisch, Mächler, Colombo, Maathuis, and Bühlmann 2012), BNlearn (Scutari 2010), bnstruct (Franzin, Sambo, and Camillo 2017) and the Java-based toolbox TETRAD (Glymour, Scheines, Spirtes, and Ramsey 2017) implement multiple approaches to structure learning, including both constraint-based and searcharXiv:2105.00488v1


HNet: Graphical Hypergeometric Networks

arXiv.org Machine Learning

Motivation: Real-world data often contain measurements with both continuous and discrete values. Despite the availability of many libraries, data sets with mixed data types require intensive pre-processing steps, and it remains a challenge to describe the relationships between variables. The data understanding phase is an important step in the data mining process, however, without making any assumptions on the data, the search space is super-exponential in the number of variables. Methods: We propose graphical hypergeometric networks (HNet), a method to test associations across variables for significance using statistical inference. The aim is to determine a network using only the significant associations in order to shed light on the complex relationships across variables. HNet processes raw unstructured data sets and outputs a network that consists of (partially) directed or undirected edges between the nodes (i.e., variables). To evaluate the accuracy of HNet, we used well known data sets and in addition generated data sets with known ground truth. The performance of HNet is compared to Bayesian structure learning. Results: We demonstrate that HNet showed high accuracy and performance in the detection of node links. In the case of the Alarm data set we can demonstrate on average an MCC score of 0.33 + 0.0002 (P<1x10-6), whereas Bayesian structure learning resulted in an average MCC score of 0.52 + 0.006 (P<1x10-11), and randomly assigning edges resulted in a MCC score of 0.004 + 0.0003 (P=0.49). Conclusions: HNet can process raw unstructured data sets, allows analysis of mixed data types, it easily scales up in number of variables, and allows detailed examination of the detected associations. Availability: https://erdogant.github.io/hnet/


Bayesian Structure Learning by Recursive Bootstrap

Neural Information Processing Systems

We address the problem of Bayesian structure learning for domains with hundreds of variables by employing non-parametric bootstrap, recursively. We propose a method that covers both model averaging and model selection in the same framework. The proposed method deals with the main weakness of constraint-based learning---sensitivity to errors in the independence tests---by a novel way of combining bootstrap with constraint-based learning. Essentially, we provide an algorithm for learning a tree, in which each node represents a scored CPDAG for a subset of variables and the level of the node corresponds to the maximal order of conditional independencies that are encoded in the graph. As higher order independencies are tested in deeper recursive calls, they benefit from more bootstrap samples, and therefore are more resistant to the curse-of-dimensionality. Moreover, the re-use of stable low order independencies allows greater computational efficiency.


Algorithms and Complexity Results for Exact Bayesian Structure Learning

arXiv.org Machine Learning

Bayesian structure learning is the NP-hard problem of discovering a Bayesian network that optimally represents a given set of training data. In this paper we study the computational worst-case complexity of exact Bayesian structure learning under graph theoretic restrictions on the super-structure. The super-structure (a concept introduced by Perrier, Imoto, and Miyano, JMLR 2008) is an undirected graph that contains as subgraphs the skeletons of solution networks. Our results apply to several variants of score-based Bayesian structure learning where the score of a network decomposes into local scores of its nodes. Results: We show that exact Bayesian structure learning can be carried out in non-uniform polynomial time if the super-structure has bounded treewidth and in linear time if in addition the super-structure has bounded maximum degree. We complement this with a number of hardness results. We show that both restrictions (treewidth and degree) are essential and cannot be dropped without loosing uniform polynomial time tractability (subject to a complexity-theoretic assumption). Furthermore, we show that the restrictions remain essential if we do not search for a globally optimal network but we aim to improve a given network by means of at most k arc additions, arc deletions, or arc reversals (k-neighborhood local search).