AITopics

2606.18011

Genre: Research Report > Experimental Study (0.47)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.54)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.75)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

arXiv.org Machine Learning May-12-2026

Fourier Feature Methods for Nonlinear Causal Discovery: FFML Scoring, TRFF Scoring, and FFCI Testing in Mixed Data

Ramsey, Joseph D.

Gaussian process (GP) marginal likelihood scores and kernel conditional independence tests are theoretically appealing for nonlinear causal discovery but computationally prohibitive at scale. We present three complementary methods based on random Fourier features (RFF), forming a practical toolkit for score-based, constraint-based, and hybrid causal discovery. The Fourier Feature Marginal Likelihood (FFML) score approximates the exact GP marginal likelihood by replacing the $n \times n$ kernel Gram matrix with a finite-dimensional feature representation, reducing cost to $O(nm^2 + m^3)$ while retaining the probabilistic interpretation and automatic complexity penalty of the exact score. FFML extends to mixed (continuous and discrete) parent sets via a product-kernel construction, with a Kronecker path for small discrete parent sets and a Hadamard-product path otherwise. The Tetrad Random Fourier Feature (TRFF) score is a complementary BIC-style alternative using penalized Student-t regression with random Fourier features. TRFF offers robustness to heavy-tailed noise and faster runtime than FFML. Empirically, TRFF and FFML exhibit a complementary precision-recall profile: TRFF achieves higher precision while FFML achieves better recall and lower SHD overall. The Fourier Feature Conditional Independence (FFCI) test is a fast nonparametric CI test for mixed data, using ridge residualization in feature space and a Frobenius-norm cross-covariance statistic approximated as a weighted sum of chi-squared variables. Empirically, BOSS+FFML achieves the lowest SHD on nonlinear data, while BOSS+TRFF offers the highest precision. When run through PC-Max, FFCI and RCIT exhibit complementary precision-recall profiles: RCIT is more precise while FFCI achieves better recall and substantially lower SHD, at approximately twice the runtime.
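
The cost reduction described in the abstract can be illustrated with a minimal sketch. This is not the paper's FFML implementation: it assumes an RBF kernel, a fixed noise variance, and the standard random-cosine feature map, and the function names (`rff_features`, `rff_log_marginal_likelihood`) are hypothetical. It shows how the Woodbury identity and the matrix determinant lemma let the Gaussian log marginal likelihood be evaluated from an $m \times m$ system rather than the $n \times n$ Gram matrix.

```python
import numpy as np

def rff_features(X, m, lengthscale, rng):
    """Map X of shape (n, d) to m random Fourier features for an RBF kernel.

    Assumption: RBF kernel with a single shared lengthscale.
    """
    d = X.shape[1]
    W = rng.normal(scale=1.0 / lengthscale, size=(d, m))  # spectral samples
    b = rng.uniform(0.0, 2.0 * np.pi, size=m)             # random phases
    return np.sqrt(2.0 / m) * np.cos(X @ W + b)

def rff_log_marginal_likelihood(Phi, y, noise_var):
    """Evaluate log N(y | 0, Phi Phi^T + noise_var * I) without forming the n x n matrix."""
    n, m = Phi.shape
    A = Phi.T @ Phi + noise_var * np.eye(m)   # m x m system, O(n m^2) to build
    L = np.linalg.cholesky(A)                 # O(m^3)
    v = np.linalg.solve(L, Phi.T @ y)         # L^{-1} Phi^T y
    # Woodbury identity: y^T K^{-1} y = (y^T y - v^T v) / noise_var
    quad = (y @ y - v @ v) / noise_var
    # Determinant lemma: log|K| = log|A| + (n - m) * log(noise_var)
    logdet = 2.0 * np.sum(np.log(np.diag(L))) + (n - m) * np.log(noise_var)
    return -0.5 * (quad + logdet + n * np.log(2.0 * np.pi))

# Illustrative usage on synthetic data (all values are arbitrary choices).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=50)
Phi = rff_features(X, m=20, lengthscale=1.0, rng=rng)
score = rff_log_marginal_likelihood(Phi, y, noise_var=0.1)
print(score)
```

In a score-based search, a quantity of this kind would be computed for each candidate parent set of a variable, with `X` holding the parents and `y` the child; how FFML selects hyperparameters and handles discrete parents is not specified here.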

artificial intelligence, ffml, machine learning, (16 more...)

2605.05743

Country: North America > United States (1.00)

Genre: Research Report (1.00)

Industry: Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.89)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.88)

Monés, Marc Franquesa, Zhang, Jiaqi, Uhler, Caroline

On the Number of Conditional Independence Tests in Constraint-based Causal Discovery

arXiv.org Machine Learning Mar-24-2026

Learning causal relations from observational data is a fundamental problem with wide-ranging applications across many fields. Constraint-based methods infer the underlying causal structure by performing conditional independence tests. However, existing algorithms such as the prominent PC algorithm need to perform a large number of independence tests, which in the worst case is exponential in the maximum degree of the causal graph. Despite extensive research, it remains unclear if there exist algorithms with better complexity without additional assumptions. Here, we establish an algorithm that achieves a better complexity of $p^{\mathcal{O}(s)}$ tests, where $p$ is the number of nodes in the graph and $s$ denotes the maximum undirected clique size of the underlying essential graph. Complementing this result, we prove that any constraint-based algorithm must perform at least $2^{\Omega(s)}$ conditional independence tests, establishing that our proposed algorithm achieves exponent-optimality up to a logarithmic factor in terms of the number of conditional independence tests needed. Finally, we validate our theoretical findings through simulations, on semi-synthetic gene-expression data, and real-world data, demonstrating the efficiency of our algorithm compared to existing methods in terms of number of conditional independence tests needed.
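
To give a rough sense of how the number of tests grows with the size of the conditioning sets, the sketch below counts the distinct conditional independence statements of the form "X_i independent of X_j given S" with |S| at most d among p variables. This is a crude combinatorial ceiling, not the bound proved in the paper and not an exact count for PC; the function name `max_distinct_ci_queries` is hypothetical, and p is assumed to be at least 2.

```python
from math import comb

def max_distinct_ci_queries(p: int, d: int) -> int:
    """Count statements X_i _||_ X_j | S with |S| <= d among p variables.

    Assumption: every unordered pair may be tested against every subset of
    the remaining p - 2 variables up to size d, which over-counts what a
    PC-style search would actually issue on a sparse graph.
    """
    pairs = comb(p, 2)
    sets_per_pair = sum(comb(p - 2, k) for k in range(d + 1))
    return pairs * sets_per_pair

# Growth in the conditioning-set size for a fixed number of variables.
for d in range(5):
    print(d, max_distinct_ci_queries(20, d))
```

The count grows roughly like p^(d+2) for small d, which is why bounding the conditioning-set size by a graph parameter matters for the total number of tests.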

artificial intelligence, graph, machine learning, (16 more...)

2603.21844

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Virginia > Arlington County > Arlington (0.04)
South America > Paraguay > Asunción > Asunción (0.04)
(6 more...)

Genre: Research Report (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Neural Information Processing Systems Feb-16-2026, 17:35:35 GMT

Conditional independence testing under misspecified inductive biases

Then, we study the performance of regression-based CI tests under misspecified inductive biases.

artificial intelligence, machine learning, type-i error control, (14 more...)

Country:

North America > United States > Michigan (0.04)
North America > United States > Texas (0.04)
North America > United States > Missouri (0.04)
(2 more...)

Genre: Research Report > New Finding (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.65)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Alexis Bellot, Mihaela van der Schaar

Conditional Independence Testing using Generative Adversarial Networks

Neural Information Processing Systems Feb-14-2026, 14:44:40 GMT


artificial intelligence, conditional independence, machine learning, (15 more...)

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > Canada (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Africa > Mali (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Therapeutic Area > Oncology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.84)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)

Neural Information Processing Systems Feb-11-2026, 08:55:31 GMT

48db67447e92539501bd71645ff33b72-Paper-Conference.pdf

cmi, dataset, estimator, (12 more...)

Country:

Asia > China > Shanghai > Shanghai (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Oceania > New Zealand (0.04)
North America > United States > North Carolina (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.97)

Neural Information Processing Systems Feb-11-2026, 04:43:27 GMT

Differentiable Causal Discovery from Interventional Data

Anonymous

The inference of causal relationships is a problem of fundamental interest in science. In all fields of research, experiments are systematically performed with the goal of elucidating the underlying causal dynamics of systems.

arXiv.org Machine Learning Feb-10-2026

Fast Flow Matching based Conditional Independence Tests for Causal Discovery

Zhao, Shunyu, Yang, Yanfeng, Li, Shuai, Fukumizu, Kenji

Constraint-based causal discovery methods require a large number of conditional independence (CI) tests, which severely limits their practical applicability due to high computational complexity. Therefore, it is crucial to design an algorithm that accelerates each individual test. To this end, we propose the Flow Matching-based Conditional Independence Test (FMCIT). The proposed test leverages the high computational efficiency of flow matching and requires the model to be trained only once throughout the entire causal discovery procedure, substantially accelerating causal discovery. According to numerical experiments, FMCIT effectively controls type-I error and maintains high testing power under the alternative hypothesis, even in the presence of high-dimensional conditioning sets. In addition, we further integrate FMCIT into a two-stage guided PC skeleton learning framework, termed GPC-FMCIT, which combines fast screening with guided, budgeted refinement using FMCIT. This design yields explicit bounds on the number of CI queries while maintaining high statistical power. Experiments on synthetic and real-world causal discovery tasks demonstrate favorable accuracy-efficiency trade-offs over existing CI testing methods and PC variants.

artificial intelligence, conditional independence test, independence test, (9 more...)

2602.08315

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)

Neural Information Processing Systems Feb-9-2026, 00:09:07 GMT

removal

For predictive models to provide reliable guidance in decision making processes, they are often required to be accurate and robust to distribution shifts. Shortcut learning, where a model relies on spurious correlations or shortcuts to predict the target label, undermines the robustness property, leading to models with poor out-of-distribution accuracy despite good in-distribution performance.

artificial intelligence, machine learning, shortcut, (18 more...)

Country: North America > United States > California (0.04)

Genre: Research Report (0.46)

Industry: Health & Medicine (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Neural Information Processing Systems Dec-26-2025, 01:27:00 GMT

Causal de Finetti: On the Identification of Invariant Causal Structure in Exchangeable Data

Like the majority of machine learning methods, existing work focuses on studying $\textit{independent and identically distributed}$ data. However, it is known that even with infinite i.i.d. data, constraint-based methods can only identify causal structures up to broad Markov equivalence classes, posing a fundamental limitation for causal discovery. In this work, we observe that exchangeable data contains richer conditional independence structure than i.i.d. data, and show how the richer structure can be leveraged for causal discovery. We first present causal de Finetti theorems, which state that exchangeable distributions with certain non-trivial conditional independences can always be represented as $\textit{independent causal mechanism (ICM)}$ generative processes. We then present our main identifiability theorem, which shows that given data from an ICM generative process, its unique causal structure can be identified through performing conditional independence tests. We finally develop a causal discovery algorithm and demonstrate its applicability to inferring causal relationships from multi-environment data.

identification, invariant causal structure, name change, (9 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.77)