Structure Learning with Continuous Optimization: A Sober Look and Beyond

Ng, Ignavier, Huang, Biwei, Zhang, Kun

arXiv.org Artificial Intelligence 

Bayesian networks are a class of probabilistic graphical models that encode probabilistic distributions in a compact way (Pearl, 1988; Koller and Friedman, 2009). Recovery of their graphical structures from data, represented by directed acyclic graphs (DAGs), has found applications in several fields such as genetics (Peters et al., 2017) and education (Gong et al., 2022). This problem is NP-hard in general (Chickering, 1996; Chickering et al., 2004) owing to the combinatorial space of DAGs. Classical structure learning approaches fall into two broad categories, i.e., constraint-based methods and score-based methods. Constraint-based methods, such as PC (Spirtes and Glymour, 1991), employ conditional independence tests to estimate the skeleton and further perform edge orientation up to the Markov equivalence class (MEC) (Spirtes et al., 2001). Score-based methods typically assign a score to each structure and search for a high-scoring structure in the space of DAGs or equivalence classes (Koivisto and Sood, 2004; Singh and Moore, 2005; Cussens, 2011; Yuan and Malone, 2013). These methods often adopt greedy search because of the large space of possible structures (Chickering, 1996), such as GES (Chickering, 2002) and GDS (Peters and Bühlmann, 2013). Recently, Zheng et al. (2018) proposed a smooth characterization of acyclicity and transformed the structure learning problem of discrete nature into a continuous, nonconvex optimization problem, thus enabling the application of gradient-based methods.

Duplicate Docs Excel Report

Title
None found

Similar Docs  Excel Report  more

TitleSimilaritySource
None found