AITopics | notear

When Does Gene Regulatory Network Inference Break? A Controlled Diagnostic Study of Causal and Correlational Methods on Single-Cell Data

Fernandez-de-Retana, Miguel, Sanchez-Corcuera, Ruben, Zulaika, Unai, Bilbao-Jayo, Aritz, Almeida, Aitor

arXiv.org Machine LearningMay-7-2026

Despite theoretical advantages, causal methods for Gene Regulatory Network (GRN) inference from single-cell RNA-seq data consistently fail to match or outperform correlation-based baselines in many realistic benchmarks, a persistent puzzle which casts doubt on the value of causality for this task. We argue that existing benchmarks are insufficiently controlled to answer this question because they evaluate on real or semi-real data where multiple pathologies co-occur, confounding failure modes, and obscuring the specific conditions under which different inference methods excel or fail. To address this gap, we introduce a controlled diagnostic framework that isolates seven biologically motivated pathologies (dropout, latent confounders, cell-type mixing, feedback loops, network density, sample size, and pseudotime drift) and measure how six representative methods spanning three inference paradigms degrade as each pathology intensifies. Across 6,120 controlled experiments, we find that causal methods genuinely dominate in clean and structurally favorable regimes, but specific pathologies (notably dropout and latent confounders) selectively neutralize their advantages. We further introduce an errortype decomposition that reveals methods with similar aggregate accuracy commit qualitatively different errors. To probe whether single-pathology effects persist when multiple stressors co-occur, we perform an interaction sweep over the three most impactful pathologies and find that their joint effects are sub-additive, while also exposing density-conditional cross-overs invisible to single-dial analysis. Our findings offer a nuanced understanding of when and why different methods succeed or fail for GRN inference, providing actionable insights for method development and practical guidance for practitioners.3

artificial intelligence, auprc, machine learning, (19 more...)

arXiv.org Machine Learning

2605.0493

Country: Europe > Spain (0.14)

Genre:

Research Report > Experimental Study (0.86)
Research Report > New Finding (0.66)
Research Report > Strength High (0.54)

Industry: Health & Medicine > Diagnostic Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.47)

Add feedback

Markov Equivalence and Consistency in Differentiable Structure Learning Chang Deng Kevin Bello

Neural Information Processing SystemsFeb-17-2026, 05:42:04 GMT

These results are then generalized to general models and likelihoods, where the same claims hold.

artificial intelligence, assumption, machine learning, (18 more...)

Neural Information Processing Systems

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > Experimental Study (0.92)

Industry: Education (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Add feedback

d04d42cdf14579cd294e5079e0745411-AuthorFeedback.pdf

Neural Information Processing SystemsFeb-10-2026, 11:45:28 GMT

constraint, notear, suggestion, (14 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.32)

Add feedback

SimultaneousMissingValueImputation andStructureLearningwithGroups

Neural Information Processing SystemsFeb-10-2026, 04:06:45 GMT

Understanding the structural relationships among different variables provides critical insights in manyreal-worldapplications, suchasmedicine,economics andeducation [42,62]. Thus,learning graphs from observed data, known as structure learning, has recently made remarkable progress [10,61,63,64]. Formanyapplications, variables inthedata can begathered into semantically meaningful groups, where useful insights are at group level. For example, in finance, one may be interested in how a financial situation influences different industries (i.e.

artificial intelligence, arxivpreprintarxiv, machine learning, (17 more...)

Neural Information Processing Systems

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > Spain (0.04)

Industry: Education (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.35)

Add feedback

DAGs with No Fears: A Closer Look at Continuous Optimization for Learning Bayesian Networks

Neural Information Processing SystemsDec-23-2025, 21:31:41 GMT

This paper re-examines a continuous optimization framework dubbed NOTEARS for learning Bayesian networks. We first generalize existing algebraic characterizations of acyclicity to a class of matrix polynomials. Next, focusing on a one-parameter-per-edge setting, it is shown that the Karush-Kuhn-Tucker (KKT) optimality conditions for the NOTEARS formulation cannot be satisfied except in a trivial case, which explains a behavior of the associated algorithm. We then derive the KKT conditions for an equivalent reformulation, show that they are indeed necessary, and relate them to explicit constraints that certain edges be absent from the graph. If the score function is convex, these KKT conditions are also sufficient for local minimality despite the non-convexity of the constraint. Informed by the KKT conditions, a local search post-processing algorithm is proposed and shown to substantially and universally improve the structural Hamming distance of all tested algorithms, typically by a factor of 2 or more. Some combinations with local search are both more accurate and more efficient than the original NOTEARS.

continuous optimization, learning bayesian network, name change, (9 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

367ab3106d990825d5b47ce91db75a73-Paper-Conference.pdf

Neural Information Processing SystemsNov-15-2025, 05:53:01 GMT

assumption, dag, failure 0, (15 more...)

Neural Information Processing Systems

Country: Europe > Switzerland > Zürich > Zürich (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Add feedback

Differentiable Constraint-Based Causal Discovery

Zhou, Jincheng, Wang, Mengbo, He, Anqi, Zhou, Yumeng, Olya, Hessam, Kocaoglu, Murat, Ribeiro, Bruno

arXiv.org Machine LearningOct-28-2025

Causal discovery from observational data is a fundamental task in artificial intelligence, with far-reaching implications for decision-making, predictions, and interventions. Despite significant advances, existing methods can be broadly categorized as constraint-based or score-based approaches. Constraint-based methods offer rigorous causal discovery but are often hindered by small sample sizes, while score-based methods provide flexible optimization but typically forgo explicit conditional independence testing. This work explores a third avenue: developing differentiable $d$-separation scores, obtained through a percolation theory using soft logic. This enables the implementation of a new type of causal discovery method: gradient-based optimization of conditional independence constraints. Empirical evaluations demonstrate the robust performance of our approach in low-sample regimes, surpassing traditional constraint-based and score-based baselines on a real-world dataset. Code and data of the proposed method are publicly available at https://github$.$com/PurdueMINDS/DAGPA.

artificial intelligence, bayesian inference, machine learning, (18 more...)

arXiv.org Machine Learning

2510.22031

Country: North America > United States (0.92)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.92)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

Add feedback

FinCARE: Financial Causal Analysis with Reasoning and Evidence

Michel, Alejandro, Arun, Abhinav, Sarmah, Bhaskarjit, Pasquali, Stefano

arXiv.org Artificial IntelligenceOct-24-2025

Portfolio managers rely on correlation-based analysis and heuristic methods that fail to capture true causal relationships driving performance. We present a hybrid framework that integrates statistical causal discovery algorithms with domain knowledge from two complementary sources: a financial knowledge graph extracted from SEC 10-K filings and large language model reasoning. Our approach systematically enhances three representative causal discovery paradigms, constraint-based (PC), score-based (GES), and continuous optimization (NOTEARS), by encoding knowledge graph constraints algorithmically and leveraging LLM conceptual reasoning for hypothesis generation. Evaluated on a synthetic financial dataset of 500 firms across 18 variables, our KG+LLM-enhanced methods demonstrate consistent improvements across all three algorithms: PC (F1: 0.622 vs. 0.459 baseline, +36%), GES (F1: 0.735 vs. 0.367, +100%), and NOTEARS (F1: 0.759 vs. 0.163, +366%). The framework enables reliable scenario analysis with mean absolute error of 0.003610 for counterfactual predictions and perfect directional accuracy for intervention effects. It also addresses critical limitations of existing methods by grounding statistical discoveries in financial domain expertise while maintaining empirical validation, providing portfolio managers with the causal foundation necessary for proactive risk management and strategic decision-making in dynamic market environments.

artificial intelligence, large language model, natural language, (16 more...)

arXiv.org Artificial Intelligence

2510.20221

Country: