Markov Equivalence and Consistency in Differentiable Structure Learning
Chang Deng, Kevin Bello, Pradeep Ravikumar, Bryon Aragam
Existing approaches to differentiable structure learning of directed acyclic graphs (DAGs) rely on strong identifiability assumptions in order to guarantee that global minimizers of the acyclicity-constrained optimization problem identify the true DAG. Moreover, it has been observed empirically that the optimizer may exploit undesirable artifacts in the loss function. We explain and remedy these issues by studying the behaviour of differentiable acyclicity-constrained programs under general likelihoods with multiple global minimizers. By carefully regularizing the likelihood, it is possible to identify the sparsest model in the Markov equivalence class, even in the absence of an identifiable parametrization or faithfulness. We first study the Gaussian case in detail, showing how proper regularization of the likelihood defines a score that identifies the sparsest model. We then generalize these results to general models and likelihoods, where the same claims hold. Furthermore, under standard faithfulness assumptions, our approach also recovers the Markov equivalence class. These theoretical results are validated empirically, showing that recovery is achievable with standard gradient-based optimizers, thus paving the way for differentiable structure learning under general models and losses.
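To make the setting concrete, the following is a minimal sketch (not the paper's method) of the generic differentiable-DAG-learning template the abstract refers to: a linear-Gaussian least-squares score, an L1 penalty standing in for the sparsity regularization discussed above, and the smooth trace-exponential acyclicity penalty h(W) = tr(exp(W∘W)) − d, all optimized with plain gradient descent. The chain graph, step sizes, and penalty weights are illustrative choices, not values from the paper.

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)
d, n = 4, 500

# Ground-truth chain DAG X1 -> X2 -> X3 -> X4 (illustrative example)
W_true = np.zeros((d, d))
for i in range(d - 1):
    W_true[i, i + 1] = 1.0

# Sample from the linear SEM X = X W_true + Z, i.e. X = Z (I - W_true)^{-1}
X = rng.standard_normal((n, d)) @ np.linalg.inv(np.eye(d) - W_true)

def h(W):
    """Smooth acyclicity penalty: zero iff the weighted graph W is acyclic."""
    return np.trace(expm(W * W)) - d

def grad(W, lam=0.05, rho=10.0):
    # Gradient of the least-squares score 1/(2n) ||X - XW||^2 ...
    g_score = -X.T @ (X - X @ W) / n
    # ... plus an L1 subgradient (sparsity) and the quadratic
    # acyclicity term rho/2 * h(W)^2, whose gradient is rho * h(W) * dh/dW.
    g_h = expm(W * W).T * 2 * W
    return g_score + lam * np.sign(W) + rho * h(W) * g_h

# Plain gradient descent; practical methods use augmented Lagrangian schemes.
W = np.zeros((d, d))
for _ in range(2000):
    W -= 0.05 * grad(W)
    np.fill_diagonal(W, 0.0)  # no self-loops
```

Note that h vanishes exactly on DAGs: for the strictly upper-triangular W_true, W∘W is nilpotent, so exp(W∘W) has unit diagonal and h(W_true) = 0, while any directed cycle makes h strictly positive.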
Nov-27-2024