AITopics | causal dag

045c87def0c02e3ad0d3d849766d7f1e-Supplemental-Conference.pdf

Neural Information Processing Systems | Feb-7-2026, 08:16:32 GMT

artificial intelligence, experiment, machine learning, (19 more...)

Neural Information Processing Systems

Genre: Research Report (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Active Structure Learning of Causal DAGs via Directed Clique Trees

Neural Information Processing Systems | Dec-24-2025, 21:34:48 GMT

A growing body of work has begun to study intervention design for efficient structure learning of causal directed acyclic graphs (DAGs). A typical setting is a causally sufficient setting, i.e., a system with no latent confounders, selection bias, or feedback, when the essential graph of the observational equivalence class (EC) is given as an input and interventions are assumed to be noiseless. Most existing works focus on worst-case or average-case lower bounds for the number of interventions required to orient a DAG. These worst-case lower bounds only establish that the largest clique in the essential graph could make it difficult to learn the true DAG. In this work, we develop a universal lower bound for single-node interventions that establishes that the largest clique is always a fundamental impediment to structure learning. Specifically, we present a decomposition of a DAG into independently orientable components through directed clique trees and use it to prove that the number of single-node interventions necessary to orient any DAG in an EC is at least the sum of half the size of the largest cliques in each chain component of the essential graph. Moreover, we present a two-phase intervention design algorithm that, under certain conditions on the chordal skeleton, matches the optimal number of interventions up to a multiplicative logarithmic factor in the number of maximal cliques. We show via synthetic experiments that our algorithm can scale to much larger graphs than most of the related work and achieves better worst-case performance than other scalable approaches.
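
As a rough illustration of the lower bound stated in the abstract above, the following Python sketch sums half the size of the largest clique over the chain components of an essential graph. It makes several assumptions that are not in the abstract: each half is rounded down, chain components are taken to be the connected components of the undirected edges, and the helper names (`chain_components`, `largest_clique_size`, `intervention_lower_bound`) are hypothetical. It is not the authors' code, and the brute-force clique search is only suitable for small graphs.

```python
from itertools import combinations


def chain_components(nodes, undirected_edges):
    """Connected components of the undirected part of an essential graph."""
    adj = {v: set() for v in nodes}
    for u, v in undirected_edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            v = stack.pop()
            if v not in component:
                component.add(v)
                stack.extend(adj[v] - component)
        seen |= component
        components.append(component)
    return adj, components


def largest_clique_size(component, adj):
    """Size of the largest clique, via plain Bron-Kerbosch enumeration."""
    best = 0

    def expand(size, candidates, excluded):
        nonlocal best
        if not candidates and not excluded:
            best = max(best, size)  # the current clique is maximal
            return
        for v in list(candidates):
            expand(size + 1, candidates & adj[v], excluded & adj[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(0, set(component), set())
    return best


def intervention_lower_bound(nodes, undirected_edges):
    """Sum over chain components of floor(largest clique size / 2).

    Assumption: the floor is our reading of "half the size" in the abstract.
    """
    adj, components = chain_components(nodes, undirected_edges)
    return sum(largest_clique_size(c, adj) // 2 for c in components)


# Toy essential graph: a 4-clique {a,b,c,d}, a path e-f-g, and an isolated node h.
nodes = list("abcdefgh")
edges = list(combinations("abcd", 2)) + [("e", "f"), ("f", "g")]
print(intervention_lower_bound(nodes, edges))  # 4//2 + 2//2 + 1//2 = 3
```

Directed edges of the essential graph are left out of the input on purpose, since under the assumption above only the undirected edges determine the chain components.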

active structure learning, intervention, name change, (9 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.63)

Towards Scalable Bayesian Learning of Causal DAGs

Neural Information Processing Systems | Dec-24-2025, 00:28:14 GMT

We give methods for Bayesian inference of directed acyclic graphs (DAGs) and the induced causal effects from passively observed complete data. Our methods build on a recent Markov chain Monte Carlo scheme for learning Bayesian networks, which enables efficient approximate sampling from the graph posterior, provided that each node is assigned a small number K of candidate parents. We present algorithmic techniques that significantly reduce the space and time requirements, making substantially larger values of K feasible. Furthermore, we investigate the problem of selecting the candidate parents per node so as to maximize the covered posterior mass. Finally, we combine our sampling method with a novel Bayesian approach for estimating causal effects in linear Gaussian DAG models. Numerical experiments demonstrate the performance of our methods in detecting ancestor-descendant relations, and in causal effect estimation our Bayesian method is shown to outperform previous approaches.
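
To make "causal effects in linear Gaussian DAG models" concrete, here is a minimal Python sketch of the quantity involved for one fixed DAG with known edge coefficients: the total effect of a source node on every other node, which in a linear model is the sum over directed paths of the product of the coefficients along each path. It does not reproduce the paper's Bayesian estimator or its sampler; the function name `total_effects` and the toy coefficients are assumptions made for illustration.

```python
from graphlib import TopologicalSorter


def total_effects(weights, source):
    """Total causal effect of `source` on each node of a linear DAG model.

    weights maps (parent, child) to the linear coefficient on that edge.
    Raises graphlib.CycleError if the edges do not form a DAG.
    """
    parents, nodes = {}, set()
    for (u, v), w in weights.items():
        parents.setdefault(v, {})[u] = w
        nodes |= {u, v}

    # TopologicalSorter expects a mapping node -> predecessors.
    order = TopologicalSorter(
        {v: list(parents.get(v, {})) for v in nodes}
    ).static_order()

    effect = {v: 0.0 for v in nodes}
    effect[source] = 1.0  # a unit change applied to the source itself
    for v in order:
        if v != source:
            # Path sums accumulate parent by parent in topological order.
            effect[v] = sum(w * effect[u] for u, w in parents.get(v, {}).items())
    return effect


# Hypothetical coefficients: z -> x, x -> m, m -> y, x -> y.
weights = {("z", "x"): 3.0, ("x", "m"): 2.0, ("m", "y"): 0.5, ("x", "y"): 1.0}
effects = total_effects(weights, "x")
print(effects["y"])  # 2.0 * 0.5 + 1.0 = 2.0
```

A Bayesian treatment along the lines the abstract describes would presumably combine such effects across sampled DAGs and parameters rather than rely on a single graph, but that step is left out here.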

name change, proceedings, scalable bayesian learning, (5 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Mitigating Hallucinations in Large Language Models via Causal Reasoning

Li, Yuangang, Shen, Yiqing, Nian, Yi, Gao, Jiechao, Wang, Ziyi, Yu, Chenxiao, Li, Shawn, Wang, Jie, Hu, Xiyang, Zhao, Yue

arXiv.org Artificial Intelligence | Nov-13-2025

Large language models (LLMs) exhibit logically inconsistent hallucinations that appear coherent yet violate reasoning principles, with recent research suggesting an inverse relationship between causal reasoning capabilities and such hallucinations. However, existing reasoning approaches in LLMs, such as Chain-of-Thought (CoT) and its graph-based variants, operate at the linguistic token level rather than modeling the underlying causal relationships between variables, and so cannot represent conditional independencies or satisfy causal identification assumptions. To bridge this gap, we introduce causal-DAG construction and reasoning (CDCR-SFT), a supervised fine-tuning framework that trains LLMs to explicitly construct a variable-level directed acyclic graph (DAG) and then reason over it. Moreover, we present a dataset comprising 25,368 samples (CausalDR), where each sample includes an input question, an explicit causal DAG, a graph-based reasoning trace, and a validated answer. Experiments on four LLMs across eight tasks show that CDCR-SFT improves causal reasoning capability, reaching a state-of-the-art 95.33% accuracy on CLADDER (surpassing human performance of 94.8% for the first time), and reduces hallucination on HaluEval, with a 10% improvement. This demonstrates that explicit causal structure modeling in LLMs can effectively mitigate logical inconsistencies in LLM outputs. Code is available at https://github.com/MrLYG/CDCR-SFT.
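
The abstract lists four parts for each CausalDR sample (question, causal DAG, reasoning trace, answer). The Python sketch below shows one way such a record could be represented and checked for acyclicity. The class name `CausalSample`, its field names, and the toy example are all hypothetical; the actual dataset schema is not given in the abstract and may differ.

```python
from dataclasses import dataclass
from graphlib import CycleError, TopologicalSorter


@dataclass
class CausalSample:
    """Hypothetical record layout; field names are assumptions."""

    question: str
    edges: list[tuple[str, str]]  # (cause, effect) pairs
    reasoning_trace: list[str]
    answer: str

    def topological_order(self) -> list[str]:
        """Variables ordered so that every cause precedes its effects."""
        preds: dict[str, set[str]] = {}
        for cause, effect in self.edges:
            preds.setdefault(cause, set())
            preds.setdefault(effect, set()).add(cause)
        return list(TopologicalSorter(preds).static_order())

    def is_dag(self) -> bool:
        """True if the edge list contains no directed cycle."""
        try:
            self.topological_order()
            return True
        except CycleError:
            return False


sample = CausalSample(
    question="Does smoking affect cancer risk through tar deposits?",
    edges=[("smoking", "tar"), ("tar", "cancer")],
    reasoning_trace=["smoking -> tar", "tar -> cancer"],
    answer="yes",
)
order = sample.topological_order()
print(sample.is_dag(), order)
```

Validating the graph before reasoning over it is a design choice of this sketch: a reasoning trace that walks the variables in topological order is only well defined when the edge list is acyclic.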

large language model, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

arXiv:2508.12495

Country:

Asia > Thailand > Bangkok > Bangkok (0.04)
North America > United States > Maryland > Prince George's County > College Park (0.04)
North America > United States > Florida > Palm Beach County > Boca Raton (0.04)
(3 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.32)

Learning Causal Graphs with Small Interventions

Karthikeyan Shanmugam, Murat Kocaoglu, Alexandros G. Dimakis, Sriram Vishwanath

Neural Information Processing Systems | Oct-2-2025, 12:59:30 GMT

We consider the problem of learning causal networks with interventions, when each intervention is limited in size under Pearl's Structural Equation Model with independent errors (SEM-IE). The objective is to minimize the number of experiments to discover the causal directions of all the edges in a causal graph. Previous work has focused on the use of separating systems for complete graphs for this task. We prove that any deterministic adaptive algorithm needs to be a separating system in order to learn complete graphs in the worst case. In addition, we present a novel separating system construction, whose size is close to optimal and is arguably simpler than previous work in combinatorics.
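
For readers unfamiliar with separating systems, the following Python sketch builds the textbook binary-label construction and checks the separating property, taken here to mean that for every pair of nodes some intervention set contains exactly one of them. This is not the size-limited construction proposed in the paper: the sets below can contain up to about half of the nodes, and the function names (`binary_separating_system`, `is_separating`) are hypothetical.

```python
from itertools import combinations


def binary_separating_system(n):
    """Intervention sets over nodes 0..n-1 built from binary labels.

    Set number `bit` contains every node whose label has that bit set.
    Distinct nodes differ in at least one bit, so some set separates them.
    Note: set sizes are NOT bounded here, unlike the setting in the abstract.
    """
    num_sets = (n - 1).bit_length() if n > 1 else 0
    return [{v for v in range(n) if (v >> bit) & 1} for bit in range(num_sets)]


def is_separating(n, sets):
    """True if every pair of nodes is split by at least one set."""
    return all(
        any((u in s) != (v in s) for s in sets)
        for u, v in combinations(range(n), 2)
    )


sets = binary_separating_system(5)
print(sets)                    # [{1, 3}, {2, 3}, {4}]
print(is_separating(5, sets))  # True
```

Dropping any one of the three sets breaks the property for five nodes; for example, without the last set nodes 0 and 4 are never separated.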

algorithm, artificial intelligence, machine learning, (18 more...)

Neural Information Processing Systems

Country:

North America > United States > Texas > Travis County > Austin (0.04)
North America > Canada > Quebec > Montreal (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.69)

Appendices Table of Contents

Neural Information Processing Systems | Aug-15-2025, 16:54:46 GMT

Thm. 4.4 assumes, without loss of generality, that all covariates are standardized to have mean

artificial intelligence, machine learning, predictor, (19 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Causal DAG Summarization (Full Version)

Zeng, Anna, Cafarella, Michael, Kenig, Batya, Markakis, Markos, Youngmann, Brit, Salimi, Babak

arXiv.org Artificial Intelligence | Apr-22-2025

Causal inference aids researchers in discovering cause-and-effect relationships, leading to scientific insights. Accurate causal estimation requires identifying confounding variables to avoid false discoveries. Pearl's causal model uses causal DAGs to identify confounding variables, but incorrect DAGs can lead to unreliable causal conclusions. For high-dimensional data, however, causal DAGs are often too complex for humans to verify. Graph summarization is a logical next step, but current methods for general-purpose graph summarization are inadequate for causal DAG summarization. This paper addresses these challenges by proposing a causal graph summarization objective that balances simplifying the graph for better understanding against retaining the causal information essential for reliable inference. We develop an efficient greedy algorithm and show that summary causal DAGs can be used directly for inference and are more robust to misspecified assumptions. In experiments on six real-life datasets, we compare our algorithm to three existing solutions, showing its effectiveness in handling high-dimensional data and its ability to generate summary DAGs that support both reliable causal inference and robustness against misspecification.

dag, large language model, machine learning, (20 more...)

arXiv.org Artificial Intelligence

arXiv:2504.14937

Country:

North America > United States (1.00)
Europe (0.67)

Genre: Research Report (1.00)

Industry:

Transportation > Passenger (0.46)
Transportation > Air (0.46)
Health & Medicine > Epidemiology (0.34)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.93)
(5 more...)

Review for NeurIPS paper: Active Structure Learning of Causal DAGs via Directed Clique Trees

Neural Information Processing Systems | Feb-12-2025, 00:31:20 GMT

They lower bound the minimal number of interventions required to orient any graph in terms of the size of the largest cliques in the essential graph of the causal model.

active structure learning, directed clique tree, neurips paper, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.40)

Review for NeurIPS paper: Active Structure Learning of Causal DAGs via Directed Clique Trees

Neural Information Processing Systems | Feb-8-2025, 07:18:55 GMT

Additional Feedback:
- Line 42: "MEC" used before defined
- Line 63: Definition of directed cycle looks weird, possibly should be *- instead of *-*? (By this definition, e.g. I.e. is it the actual m(D), or the lower bound provided by Theorem 2?
- Appendix, lines 591-593: Please elaborate on the clique intervention lower bound, or provide a reference.
The lower bound is indeed kind of nice, but I still disagree with the authors on the clarity of presentation. The claim itself can be presented as a simple combinatorial statement, and the proof does not use any advanced techniques. In particular, I would encourage the authors to make sure that the proofs in the main paper can be followed without reference to the appendix or prior work.

active structure learning, directed clique tree, neurips paper, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.40)
