A Algorithms

Neural Information Processing Systems 

for k = 2 to n do ...

The result follows directly from Theorem 1 in Cranko et al. [2021]: sup ...

Lemma 7. If Assumption 3 holds, for any ... 1 is an eigenvector of H ... Similarly, applying Hoeffding's inequality and the Kantorovich-Rubinstein theorem gives Prob(E ...

Theorem 9. Given a Bayesian network ...

Proof. We prove the statements of this theorem in several steps. To prove (a) and (b), we first show that the DRO problem is strictly convex when the true non-neighbors are known, so that an optimal solution exists; we then show that the solution to Equation (4) under the true non-neighbor constraints is optimal. Consequently, no non-neighbor nodes are recovered in the skeleton. The argument follows the proof of Lemma 11.2 in Hastie et al. [2015]. This establishes properties (a) and (b). Thresholding at β/2 then recovers all neighbor nodes. We are now ready to prove (d).

BIC is not applicable to skeletons. The best and runner-up results are marked in bold. Significant differences are marked (paired t-test, p < 0.05).

The final sample complexity becomes m = O(C(ε ...
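The concentration step above invokes Hoeffding's inequality. For reference, its standard two-sided form reads as follows (the paper's bound will differ in its constants and in the definition of the event E, which depend on the paper's setup); here the $X_i \in [a_i, b_i]$ are independent and $S_n = \sum_{i=1}^n X_i$:

```latex
\Pr\big(\,\lvert S_n - \mathbb{E}[S_n]\rvert \ge t\,\big)
\;\le\; 2\exp\!\left(-\frac{2t^2}{\sum_{i=1}^{n}(b_i - a_i)^2}\right).
```

Combining such a tail bound with the Kantorovich-Rubinstein (Wasserstein duality) theorem is the standard route to controlling the probability of the bad event in Wasserstein-DRO analyses.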
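The neighbor-recovery step above keeps exactly the coordinates whose estimated coefficients exceed half the minimal true signal strength β. A minimal sketch of that thresholding rule, assuming the estimation error of every coordinate is below β/2 (the names `beta_hat` and `beta_min` are illustrative, not from the paper):

```python
def recover_neighbors(beta_hat, beta_min):
    """Return indices whose estimated coefficient magnitude exceeds beta_min / 2.

    If every true neighbor's estimate is within beta_min / 2 of its true value
    and every non-neighbor's estimate stays below beta_min / 2, this recovers
    the exact neighbor set.
    """
    threshold = beta_min / 2.0
    return [j for j, b in enumerate(beta_hat) if abs(b) > threshold]

# Example: true neighbors at indices 0 and 2 (true coefficients >= beta_min = 1.0),
# small spurious estimates elsewhere.
print(recover_neighbors([0.9, 0.1, 1.2, -0.05], 1.0))  # -> [0, 2]
```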
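The significance marks in the tables come from a paired t-test at p < 0.05. A sketch of that test on hypothetical per-dataset scores for two methods (the score values below are made up for illustration; 2.776 is the standard two-sided 5% critical value of Student's t with 4 degrees of freedom):

```python
from math import sqrt
from statistics import mean, stdev

def paired_t_statistic(scores_a, scores_b):
    """t statistic of a paired t-test on per-dataset score pairs."""
    diffs = [x - y for x, y in zip(scores_a, scores_b)]
    n = len(diffs)
    # t = mean difference divided by the standard error of the differences.
    return mean(diffs) / (stdev(diffs) / sqrt(n))

# Hypothetical scores on 5 datasets (so df = n - 1 = 4).
t = paired_t_statistic([0.9, 0.8, 0.95, 0.9, 0.9], [0.7, 0.7, 0.65, 0.7, 0.7])
# |t| > 2.776 means the difference is significant at p < 0.05 (two-sided, df = 4).
print(round(t, 2), abs(t) > 2.776)  # -> 6.32 True
```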