 acyclicity


Large-Scale Differentiable Causal Discovery of Factor Graphs

Neural Information Processing Systems

A common theme in causal inference is learning causal relationships between observed variables, also known as causal discovery. This is usually a daunting task, given the large number of candidate causal graphs and the combinatorial nature of the search space.



Efficient Causal Structure Learning via Modular Subgraph Integration

Sun, Haixiang, Tian, Pengchao, Zhou, Zihan, Zhang, Jielei, Li, Peiyi, Liu, Andrew L.

arXiv.org Machine Learning

Learning causal structures from observational data remains a fundamental yet computationally intensive task, particularly in high-dimensional settings where existing methods face challenges such as the super-exponential growth of the search space and increasing computational demands. To address this, we introduce VISTA (Voting-based Integration of Subgraph Topologies for Acyclicity), a modular framework that decomposes the global causal structure learning problem into local subgraphs based on Markov Blankets. The global integration is achieved through a weighted voting mechanism that penalizes low-support edges via exponential decay, filters unreliable ones with an adaptive threshold, and ensures acyclicity using a Feedback Arc Set (FAS) algorithm. The framework is model-agnostic, imposing no assumptions on the inductive biases of base learners, is compatible with arbitrary data settings without requiring specific structural forms, and fully supports parallelization. We also theoretically establish finite-sample error bounds for VISTA, and prove its asymptotic consistency under mild conditions. Extensive experiments on both synthetic and real datasets consistently demonstrate the effectiveness of VISTA, yielding notable improvements in both accuracy and efficiency over a wide range of base learners.
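The integration step described in this abstract (vote, discount low-support edges, threshold, break cycles) might be sketched roughly as follows. This is an illustration under stated assumptions, not the VISTA implementation: the score formula, the fixed `threshold` (the paper uses an adaptive one), the greedy cycle-breaking pass (a simple stand-in for a Feedback Arc Set algorithm), and all function and parameter names are hypothetical.

```python
import math
from collections import defaultdict


def creates_cycle(adj, src, dst):
    """Return True if adding src -> dst would close a directed cycle."""
    stack, seen = [dst], set()
    while stack:
        node = stack.pop()
        if node == src:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(adj[node])
    return False


def integrate(subgraphs, decay=1.0, threshold=0.5):
    """Merge local directed edge lists into one acyclic edge list.

    subgraphs: list of (nodes, edges) pairs, where every edge joins
    two members of `nodes` (assumed input format).
    """
    votes = defaultdict(int)     # subgraphs that contain the edge
    coverage = defaultdict(int)  # subgraphs that could have contained it
    for nodes, edges in subgraphs:
        for i in nodes:
            for j in nodes:
                if i != j:
                    coverage[(i, j)] += 1
        for edge in edges:
            votes[edge] += 1

    # Assumed score: vote share, discounted exponentially when support is low.
    scores = {
        edge: (v / coverage[edge]) * (1.0 - math.exp(-decay * v))
        for edge, v in votes.items()
    }

    # Keep edges above the threshold, strongest first.
    ranked = sorted(scores, key=scores.get, reverse=True)
    kept = [edge for edge in ranked if scores[edge] >= threshold]

    # Greedy cycle breaking: skip any edge that would close a cycle.
    adj = defaultdict(list)
    result = []
    for src, dst in kept:
        if not creates_cycle(adj, src, dst):
            adj[src].append(dst)
            result.append((src, dst))
    return result


subgraphs = [
    ({"A", "B", "C"}, [("A", "B"), ("B", "C")]),
    ({"A", "B", "C"}, [("A", "B"), ("B", "C"), ("C", "A")]),
    ({"B", "C", "D"}, [("B", "C"), ("C", "D")]),
]
merged = integrate(subgraphs, threshold=0.3)
```

In this toy input the edge C -> A passes the lowered threshold but is dropped because A -> B -> C is already in the merged graph. Each local subgraph can be learned independently, which is where the parallelism mentioned in the abstract would come from.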


DAG Learning from Zero-Inflated Count Data Using Continuous Optimization

Sato, Noriaki, Scutari, Marco, Kawano, Shuichi, Yamaguchi, Rui, Imoto, Seiya

arXiv.org Machine Learning

We address network structure learning from zero-inflated count data by casting each node as a zero-inflated generalized linear model and optimizing a smooth, score-based objective under a directed acyclic graph constraint. Our Zero-Inflated Continuous Optimization (ZICO) approach uses node-wise likelihoods with canonical links and enforces acyclicity through a differentiable surrogate constraint combined with sparsity regularization. ZICO achieves superior performance with faster runtimes on simulated data. It also performs comparably to or better than common algorithms for reverse engineering gene regulatory networks. ZICO is fully vectorized and mini-batched, enabling learning on larger variable sets with practical runtimes in a wide range of domains.
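For orientation, a node-wise zero-inflated likelihood of the kind the abstract describes could look like the sketch below. It assumes a zero-inflated Poisson with a log link and a single constant zero-inflation probability per node; the abstract does not specify these choices, and the function and parameter names are illustrative only.

```python
import math


def zip_log_pmf(y, rate, pi_zero):
    """Log-probability of count y under a zero-inflated Poisson.

    pi_zero is the probability of a structural zero (0 <= pi_zero < 1).
    """
    log_pois = -rate + y * math.log(rate) - math.lgamma(y + 1)
    if y == 0:
        # A zero is either structural or drawn from the Poisson part.
        return math.log(pi_zero + (1.0 - pi_zero) * math.exp(log_pois))
    return math.log(1.0 - pi_zero) + log_pois


def node_log_likelihood(counts, parent_rows, weights, bias, pi_zero):
    """Sum the log-likelihood of one node given its parents' values.

    counts[i] is the node's count in sample i; parent_rows[i] holds the
    parent values in that sample. The rate uses a log link (assumed).
    """
    total = 0.0
    for y, parents in zip(counts, parent_rows):
        rate = math.exp(bias + sum(w * x for w, x in zip(weights, parents)))
        total += zip_log_pmf(y, rate, pi_zero)
    return total


counts = [0, 0, 3, 1]
parent_rows = [[0.0], [1.0], [2.0], [0.5]]
ll = node_log_likelihood(counts, parent_rows, weights=[0.4], bias=0.1, pi_zero=0.3)
```

A score-based learner would sum such node-wise terms over all nodes, add a sparsity penalty on the weights, and optimize the total under an acyclicity constraint.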


DAG DECORation: Continuous Optimization for Structure Learning under Hidden Confounding

Pal, Samhita, O'quinn, James, Aryan, Kaveh, Pua, Heather, Long, James P., Asiaee, Amir

arXiv.org Artificial Intelligence

We study structure learning for linear Gaussian SEMs in the presence of latent confounding. Existing continuous methods excel when errors are independent, while deconfounding-first pipelines rely on pervasive factor structure or nonlinearity. We propose DECOR, a single likelihood-based and fully differentiable estimator that jointly learns a DAG and a correlated noise model. Our theory gives simple sufficient conditions for global parameter identifiability: if the mixed graph is bow-free and the noise covariance has a uniform eigenvalue margin, then the map from $(B, \Omega)$ to the observational covariance is injective, so both the directed structure and the noise are uniquely determined. The estimator alternates a smooth-acyclic graph update with a convex noise update and can include a light bow complementarity penalty or a post hoc reconciliation step. On synthetic benchmarks that vary confounding density, graph density, latent rank, and dimension with $n




e205ee2a5de471a70c1fd1b46033a75f-AuthorFeedback.pdf

Neural Information Processing Systems

We thank all the reviewers for their insightful comments! Regarding Theorem 1, yes, global indices are only needed in ordered cases. We will try to improve the title. This also indicates that our model learns a more compact latent space. We will add this possible explanation in the revised version.


Reviews: DAGs with NO TEARS: Continuous Optimization for Structure Learning

Neural Information Processing Systems

The authors study the problem of structure learning for Bayesian networks. Conventional methods for this task are either constraint-based or score-based; the latter optimize a discrete score function over the set of DAGs under a combinatorial constraint. Unlike existing approaches, the authors formulate the problem as a continuous optimization problem over real matrices, which performs a global search and can be solved using standard numerical algorithms. The main idea in this work is to use a smooth function to express an equality constraint that enforces acyclicity on the estimated structure. The paper is very well written and enjoyable to read.
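The smooth equality constraint mentioned in this review is, in the NOTEARS paper, h(W) = tr(exp(W ∘ W)) − d = 0, which holds exactly when the weighted adjacency matrix W on d nodes has no directed cycles. Below is a minimal pure-Python sketch that approximates h with a truncated power series instead of a true matrix exponential; the function names and the `terms` parameter are illustrative and not taken from the authors' code.

```python
import math


def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def acyclicity(w, terms=30):
    """Approximate h(W) = tr(exp(W ∘ W)) - d by a truncated series.

    Uses tr(exp(A)) - d = sum over k >= 1 of tr(A^k) / k!, where A is
    the elementwise square of W.
    """
    d = len(w)
    a = [[x * x for x in row] for row in w]  # Hadamard square, entries >= 0
    power = [[float(i == j) for j in range(d)] for i in range(d)]  # A^0
    h = 0.0
    for k in range(1, terms + 1):
        power = matmul(power, a)  # now holds A^k
        h += sum(power[i][i] for i in range(d)) / math.factorial(k)
    return h


# 0 -> 1 -> 2 is acyclic; adding 2 -> 0 closes a cycle.
dag = [[0.0, 1.5, 0.0],
       [0.0, 0.0, -2.0],
       [0.0, 0.0, 0.0]]
cyclic = [[0.0, 1.5, 0.0],
          [0.0, 0.0, -2.0],
          [0.5, 0.0, 0.0]]

h_dag = acyclicity(dag)
h_cyclic = acyclicity(cyclic)
```

Here `h_dag` is zero and `h_cyclic` is strictly positive, since tr(A^k) sums the weights of closed walks of length k. Because h is differentiable in W, it can serve as an equality constraint in a standard constrained optimizer.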


Federated Causality Learning with Explainable Adaptive Optimization

Yang, Dezhi, He, Xintong, Wang, Jun, Yu, Guoxian, Domeniconi, Carlotta, Zhang, Jinglin

arXiv.org Artificial Intelligence

Discovering causality from observational data is a crucial task in various scientific domains. With increasing awareness of privacy, data often cannot be shared, and it is very hard to learn causal graphs from dispersed data, since these data may follow different distributions. In this paper, we propose a federated causal discovery strategy (FedCausal) to learn a unified global causal graph from decentralized heterogeneous data. We design a global optimization formula to naturally aggregate the causal graphs from client data and constrain the acyclicity of the global graph without exposing local data. Unlike other federated causal learning algorithms, FedCausal unifies the local and global optimizations into a complete directed acyclic graph (DAG) learning process with a flexible optimization objective. We prove that this optimization objective is highly interpretable and can adaptively handle homogeneous and heterogeneous data. Experimental results on synthetic and real datasets show that FedCausal can effectively deal with non-independently and identically distributed (non-iid) data and achieves superior performance.