Reviews: Learning Treewidth-Bounded Bayesian Networks with Thousands of Variables

Neural Information Processing Systems 

The proposed method is very similar to previous work by Nie et al. -- both use k-trees to search for low-treewidth Bayesian networks, both start with a randomly chosen initial clique, and both propose using an A* method for finding the best tree. The differences are that Nie et al. score k-trees using a mutual information score and use BDeu for choosing the final consistent Bayesian network, while this paper proposes using BIC and incrementally building the Bayesian network along with the k-tree, using the BN to score the k-tree. This paper also includes the additional restriction that the complete variable (partial) order is chosen randomly, while in Nie et al. The main justification for these differences is the ability to scale to large treewidths. However, in the experiments, the previous S2 algorithm also can scale to large treewidths.