AITopics | conditional independence

6739d8df16b5bce3587ca5f18662a6aa-Supplemental-Conference.pdf

Neural Information Processing Systems | Apr-26-2026, 14:09:21 GMT

Here we provide proofs of the statements made in the main text, further figures from numerical experiments, and a more detailed discussion of heteroskedasticity effects on causal discovery. Let $(X_i, Y_i)_{i=1,\dots,n}$ be an independent sample with Pearson correlation coefficient $\rho$, and assume the linear model $Y_i = X_i \beta + h(Z_i)\epsilon_i$, where $Z_i$ and $\epsilon_i$ are independent and standard normal, and $h$ is the noise scaling function. [...] Testing whether the Pearson correlation between $X$ and $Y$ is zero is equivalent to testing whether the slope parameter $\beta$ is equal to zero. Therefore, this is a homoskedastic problem. A.1.2 Discussion of Effect 2: We start by discussing the homoskedastic case to see where non-constant variance of noise leads to problems within the t-test.
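The effect of non-constant noise variance on the correlation t-test can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the authors' experiment: it takes $X_i = Z_i$ (the excerpt does not say how $X$ relates to $Z$), uses the hypothetical scaling $h(z) = |z|$, and approximates the 5% critical value of the t distribution by 1.96. The helper names `pearson_t_stat` and `rejection_rate` are made up for this example.

```python
import numpy as np

def pearson_t_stat(x, y):
    """Row-wise t statistic for H0: Pearson correlation equals zero."""
    n = x.shape[-1]
    xc = x - x.mean(axis=-1, keepdims=True)
    yc = y - y.mean(axis=-1, keepdims=True)
    r = (xc * yc).sum(axis=-1) / np.sqrt((xc**2).sum(axis=-1) * (yc**2).sum(axis=-1))
    return r * np.sqrt((n - 2) / (1.0 - r**2))

def rejection_rate(h, beta=0.0, n=200, reps=2000, seed=0):
    """Fraction of simulated samples in which the t-test rejects at about 5%."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((reps, n))
    eps = rng.standard_normal((reps, n))
    x = z  # assumption for illustration: the regressor also drives the noise scale
    y = beta * x + h(z) * eps  # linear model Y = X*beta + h(Z)*eps
    t = pearson_t_stat(x, y)
    return float(np.mean(np.abs(t) > 1.96))  # normal approximation to the t quantile

# beta = 0 in both cases, so every rejection is a false positive.
homo_rate = rejection_rate(lambda z: np.ones_like(z))  # constant noise scale
hetero_rate = rejection_rate(lambda z: np.abs(z))      # noise scale grows with |Z|
print(f"homoskedastic: {homo_rate:.3f}, heteroskedastic: {hetero_rate:.3f}")
```

Under these assumptions the homoskedastic rejection rate should stay near the nominal level, while the heteroskedastic one is inflated, because the t-test's variance estimate ignores the dependence between the regressor and the noise scale.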

artificial intelligence, heteroskedasticity, machine learning, (14 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.55)

Useful Facts

Neural Information Processing Systems | Apr-24-2026, 10:32:24 GMT

A.1 Relation of Inverse Covariance Matrix and Partial Correlation. For a covariance matrix of joint distribution for variables $X, Y$, the covariance matrix is [...] The derivation comes from the following: Lemma A.1 (Conditional independence (adapted from [34])). Notice that for an arbitrary function $f$, $E[f(X) \mid Y] = E^L[f(X) \mid \phi_y(Y)]$ with one-hot encoding of the discrete variable $Y$. Therefore, for any feature map we can also get that conditional independence ensures: [...] This finishes the proof of Lemma D.4. A.3 Technical Facts for Matrix Concentration. We include this covariance concentration result, adapted from Claim A.2 in [18]: Claim A.2 (covariance concentration for Gaussian variables). Let $X = [x_1, x_2, \dots, x_n]^\top \in \mathbb{R}^{n \times d}$ where each $x_i \sim \mathcal{N}(0, \Sigma_X)$. Then for any given matrix $B \in \mathbb{R}^{d \times m}$ that is of rank $k$ and is independent of $X$, with probability at least $1 - \delta/10$ over $X$ we have $0.9\, B^\top \Sigma_X B \preceq \frac{1}{n} B^\top X^\top X B \preceq 1.1\, B^\top \Sigma_X B$. Let $X = [x_1, x_2, \dots, x_n]^\top \in \mathbb{R}^{n \times d}$ where each $x_i$ is $\rho^2$-sub-gaussian. Then for any given matrix $B \in \mathbb{R}^{d \times m}$ that is of rank $k$ and is independent of $X$, with probability at least $1 - \delta/10$ over $X$ we have $0.9\, B^\top \Sigma_X B \preceq \frac{1}{n} B^\top X^\top X B \preceq 1.1\, B^\top \Sigma_X B$. Let $Z \in \mathbb{R}^{n \times k}$ be a matrix with row vectors sampled i.i.d. from the Gaussian distribution $\mathcal{N}(0, \Sigma_Z)$. Let $P \in \mathbb{R}^{n \times n}$ be a fixed projection onto a space of dimension $d$.
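The relation named in the heading of A.1 can be checked numerically. The sketch below uses the standard identity that the partial correlation of variables $i$ and $j$ given all others equals $-\Theta_{ij} / \sqrt{\Theta_{ii}\Theta_{jj}}$, where $\Theta$ is the inverse covariance matrix. The excerpt's own formula is missing, so this textbook identity is an assumption about what the section derives. The chain model X -> Z -> Y, its coefficients, and the function name `partial_corr_from_precision` are hypothetical.

```python
import numpy as np

def partial_corr_from_precision(cov):
    """Partial correlations given all remaining variables, from the inverse covariance."""
    prec = np.linalg.inv(cov)
    d = np.sqrt(np.diag(prec))
    pc = -prec / np.outer(d, d)
    np.fill_diagonal(pc, 1.0)
    return pc

# Hypothetical linear chain X -> Z -> Y with unit-variance noise:
#   X ~ N(0, 1),  Z = a*X + e1,  Y = b*Z + e2.
a, b = 0.8, -0.5
var_x = 1.0
var_z = a**2 * var_x + 1.0
var_y = b**2 * var_z + 1.0
cov_xz = a * var_x
cov_zy = b * var_z
cov_xy = a * b * var_x

# Population covariance matrix, variable order (X, Z, Y).
cov = np.array([
    [var_x,  cov_xz, cov_xy],
    [cov_xz, var_z,  cov_zy],
    [cov_xy, cov_zy, var_y],
])

pc = partial_corr_from_precision(cov)
print("partial corr(X, Y | Z):", pc[0, 2])  # vanishes up to rounding: X is independent of Y given Z
print("partial corr(X, Z | Y):", pc[0, 1])  # nonzero: the direct link remains
```

For Gaussian variables a zero entry in the inverse covariance corresponds to conditional independence given the remaining variables, which is why the (X, Y) entry vanishes in the chain.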

artificial intelligence, machine learning, xdown1, (18 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.69)

02e656adee09f8394b402d9958389b7d-Paper.pdf

Neural Information Processing Systems | Apr-24-2026, 10:32:21 GMT

artificial intelligence, machine learning, natural language, (17 more...)

Neural Information Processing Systems

Country: North America > United States (0.46)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.69)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

06b71ad997f7e3e4b2e2f2ea12e5a759-Paper-Conference.pdf

Neural Information Processing Systems | Apr-24-2026, 09:48:45 GMT

artificial intelligence, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country: North America (0.46)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
(4 more...)

Ancestral Causal Inference

Sara Magliacane, Tom Claassen, Joris M. Mooij

Neural Information Processing Systems | Apr-22-2026, 13:44:34 GMT

Constraint-based causal discovery from limited data is a notoriously difficult challenge due to the many borderline independence test decisions. Several approaches to improve the reliability of the predictions by exploiting redundancy in the independence information have been proposed recently. Though promising, existing approaches can still be greatly improved in terms of accuracy and scalability. We present a novel method that reduces the combinatorial explosion of the search space by using a more coarse-grained representation of causal information, drastically reducing computation time. Additionally, we propose a method to score causal predictions based on their confidence. Crucially, our implementation also allows one to easily combine observational and interventional data and to incorporate various types of available background knowledge. We prove soundness and asymptotic consistency of our method and demonstrate that it can outperform the state-of-the-art on synthetic data, achieving a speedup of several orders of magnitude. We illustrate its practical feasibility by applying it to a challenging protein data set.
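To make the idea of a coarse-grained representation concrete, the sketch below computes ancestral relations (which variable is a cause, direct or indirect, of which) from a directed acyclic graph. It assumes, based only on the paper's title, that ancestral relations are the coarse-grained representation meant here. It is not the authors' algorithm, and the function name `ancestral_relations` is hypothetical.

```python
from collections import defaultdict

def ancestral_relations(edges):
    """Return the set of (ancestor, descendant) pairs implied by directed edges."""
    children = defaultdict(set)
    nodes = set()
    for parent, child in edges:
        children[parent].add(child)
        nodes.update((parent, child))

    pairs = set()
    for start in nodes:
        # Depth-first search over everything reachable from `start`.
        stack = list(children[start])
        seen = set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            pairs.add((start, node))
            stack.extend(children[node])
    return pairs

# Two different graphs over the same variables ...
chain = [("a", "b"), ("b", "c")]
chain_plus_shortcut = chain + [("a", "c")]

# ... share one ancestral structure, so a search over ancestral relations
# treats them as a single candidate instead of two.
same = ancestral_relations(chain) == ancestral_relations(chain_plus_shortcut)
print(sorted(ancestral_relations(chain)), same)
```

Because several graphs collapse onto the same set of ancestral pairs, a search over such pairs has fewer candidates to consider than a search over full graphs.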

artificial intelligence, machine learning, relation, (18 more...)

Neural Information Processing Systems

Country: Europe (0.93)

Genre: Research Report (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Binary Expansion Group Intersection Network

Zhou, Sicheng; Zhang, Kai

arXiv.org Machine Learning | Mar-30-2026

Conditional independence is central to modern statistics, but beyond special parametric families it rarely admits an exact covariance characterization. We introduce the binary expansion group intersection network (BEGIN), a distribution-free graphical representation for multivariate binary data and bit-encoded multinomial variables. For arbitrary binary random vectors and bit representations of multinomial variables, we prove that conditional independence is equivalent to a sparse linear representation of conditional expectations, to a block factorization of the corresponding interaction covariance matrix, and to block diagonality of an associated generalized Schur complement. The resulting graph is indexed by the intersection of multiplicative groups of binary interactions, yielding an analogue of Gaussian graphical modeling beyond the Gaussian setting. This viewpoint treats data bits as atoms and local BEGIN molecules as building blocks for large Markov random fields. We also show how dyadic bit representations allow BEGIN to approximate conditional independence for general random vectors under mild regularity conditions. A key technical device is the Hadamard prism, a linear map that links interaction covariances to group structure.
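The link between conditional independence and a sparse linear representation of conditional expectations can be seen in the smallest case. For variables coded as -1/+1, any function of (Y, Z) is exactly a linear combination of the interactions 1, Y, Z, and YZ, so E[X | Y, Z] has four coefficients. If X is conditionally independent of Y given Z, the coefficients on Y and YZ must vanish. The sketch below checks this on a made-up distribution. It illustrates the general idea only and is not the BEGIN construction; all probabilities and names are hypothetical.

```python
from itertools import product

# Hypothetical model with X independent of Y given Z; all variables take values in {-1, +1}.
p_z = {-1: 0.3, 1: 0.7}
p_x1_given_z = {-1: 0.2, 1: 0.9}   # P(X = +1 | Z = z)
p_y1_given_z = {-1: 0.6, 1: 0.25}  # P(Y = +1 | Z = z)

def bern(p_plus, value):
    """Probability of `value` for a +/-1 variable with P(+1) = p_plus."""
    return p_plus if value == 1 else 1.0 - p_plus

def joint(x, y, z):
    """Joint probability P(X = x, Y = y, Z = z)."""
    return p_z[z] * bern(p_x1_given_z[z], x) * bern(p_y1_given_z[z], y)

def cond_mean_x(y, z):
    """E[X | Y = y, Z = z], computed from the joint distribution."""
    num = sum(x * joint(x, y, z) for x in (-1, 1))
    den = sum(joint(x, y, z) for x in (-1, 1))
    return num / den

def interaction_coefficients():
    """Coefficients c with E[X | Y, Z] = c_1 + c_Y*Y + c_Z*Z + c_YZ*Y*Z."""
    basis = {
        "1": lambda y, z: 1,
        "Y": lambda y, z: y,
        "Z": lambda y, z: z,
        "YZ": lambda y, z: y * z,
    }
    # The four interactions are orthogonal over {-1, +1}^2, so each coefficient
    # is an average of the conditional mean against the matching interaction.
    return {
        name: sum(cond_mean_x(y, z) * chi(y, z)
                  for y, z in product((-1, 1), repeat=2)) / 4
        for name, chi in basis.items()
    }

coef = interaction_coefficients()
print(coef)  # the "Y" and "YZ" entries vanish up to rounding
```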

artificial intelligence, conditional independence, machine learning, (17 more...)

arXiv.org Machine Learning

2603.24763

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.05)
Asia > Middle East > Jordan (0.05)
North America > United States > North Carolina > Orange County > Chapel Hill (0.04)

Genre: Research Report (0.64)

Industry: Education > Educational Setting (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.50)

Notes on Forré's Notion of Conditional Independence and Causal Calculus for Continuous Variables

Chen, Leihao

arXiv.org Machine Learning | Mar-26-2026

Recently, Forré (arXiv:2104.11547, 2021) introduced transitional conditional independence, a notion of conditional independence that provides a unified framework for both random and non-stochastic variables. The original paper establishes a strong global Markov property connecting transitional conditional independencies with suitable graphical separation criteria for directed mixed graphs with input nodes (iDMGs), together with a version of causal calculus for iDMGs in a general measure-theoretic setting. These notes aim to further illustrate the motivations behind this framework and its connections to the literature, highlight certain subtleties in the general measure-theoretic causal calculus, and extend the "one-line" formulation of the ID algorithm of Richardson et al. (Ann. Statist. 51(1):334–361, 2023) to the general measure-theoretic setting.

artificial intelligence, conditional independence, independence, (16 more...)

arXiv.org Machine Learning

2603.24333

Country:

Europe > Netherlands > North Holland > Amsterdam (0.04)
North America > United States > Louisiana > East Baton Rouge Parish > Baton Rouge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.41)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.91)

Causal discovery with endogenous context variables

Neural Information Processing Systems | Mar-19-2026, 20:10:15 GMT

Systems with variations of the underlying generating mechanism between different contexts, i.e., different environments or internal states in which the system operates, are common in the real world, such as soil moisture regimes in Earth science. Besides understanding the shared properties of the system, in practice, the question of context-specific properties, i.e., the change in causal relationships between contexts, arises. For real-world data, contexts are often driven by system variables, e.g., precipitation highly influences soil moisture. Nevertheless, this setup needs to be studied more. To account for such endogenous contexts in causal discovery, our work proposes a constraint-based method that can efficiently discover context-specific causal graphs using an adaptive testing approach. Our approach tests conditional independence on the pooled datasets to infer the dependence between system variables, including the context, to avoid introducing selection bias. To yield context-specific insights, conditional independence is tested on context-specific data. We work out the theoretical framework for this adaptive testing approach and give a detailed discussion of the connection to structural causal models, including sufficiency assumptions, which allow us to prove the soundness of our algorithm and to interpret the results causally. A simulation study to evaluate numerical properties shows that our approach behaves as expected, but also leads to a further understanding of current limitations and viable extensions.
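The selection-bias concern behind testing on pooled data can be illustrated with a toy simulation. In this minimal sketch, two system variables are independent by construction, and a hypothetical endogenous context is driven by both of them. Restricting the data to one context then induces a spurious dependence that the pooled data do not show. The data-generating process and all names are assumptions made for illustration; this is not the authors' method.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# Two system variables that are independent by construction.
x = rng.standard_normal(n)
y = rng.standard_normal(n)

# Hypothetical endogenous context: a regime indicator driven by both variables.
context = (x + y + 0.5 * rng.standard_normal(n)) > 0

def pearson(a, b):
    """Sample Pearson correlation between two 1-D arrays."""
    return float(np.corrcoef(a, b)[0, 1])

r_pooled = pearson(x, y)                     # uses all samples
r_context = pearson(x[context], y[context])  # uses one context only

print(f"pooled correlation:           {r_pooled:+.3f}")
print(f"context-specific correlation: {r_context:+.3f}")
```

In this setup the pooled correlation stays close to zero, while the within-context correlation is clearly negative, even though X and Y never influence each other.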

artificial intelligence, name change, proceedings, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)

Constraint-based Causal Structure Learning with Consistent Separating Sets

Neural Information Processing Systems | Feb-14-2026, 20:16:44 GMT

This paper concerns, more specifically, the inconsistency of separating sets used to remove dispensable edges, iteratively, based on conditional independence tests.

algorithm, artificial intelligence, machine learning, (18 more...)

Neural Information Processing Systems

Country: