neg
DSCS: Fast CPDAG-Based Verification of Collapsible Submodels in High-Dimensional Bayesian Networks
Bayesian networks (BNs), represented by directed acyclic graphs (DAGs), provide a principled framework for modeling complex dependencies among random variables. As data dimensionality increases into the tens of thousands, fitting and marginalizing a full BN becomes computationally prohibitive, particularly when inference is only needed for a small subset of variables. Estimation-collapsibility addresses this challenge by ensuring that directly fitting a submodel, obtained by ignoring non-essential variables, still yields exact inference on the target variables. However, the current DAG-based criterion for checking estimation-collapsibility is computationally intensive, involving exhaustive vertex searches and iterative removals. Additionally, practical applications typically identify the underlying DAG only up to its Markov equivalence class, represented by a completed partially directed acyclic graph (CPDAG). To bridge this gap, we introduce sequential c-simplicial sets, a novel graphical characterization of estimation-collapsibility directly applicable to CPDAGs. We further propose DSCS, a computationally efficient algorithm for verifying estimation-collapsibility within the CPDAG framework that scales effectively to high-dimensional BNs. Extensive numerical experiments demonstrate the practicality, scalability, and efficiency of our proposed approach.
Contrastive Conformal Sets
Alkhatib, Yahya, Tay, Wee Peng
Contrastive learning produces coherent semantic feature embeddings by encouraging positive samples to cluster closely while separating negative samples. However, existing contrastive learning methods lack principled guarantees on coverage within the semantic feature space. We extend conformal prediction to this setting by introducing minimum-volume covering sets equipped with learnable generalized multi-norm constraints. We propose a method that constructs conformal sets guaranteeing user-specified coverage of positive samples while maximizing negative sample exclusion. We establish theoretically that volume minimization serves as a proxy for negative exclusion, enabling our approach to operate effectively even when negative pairs are unavailable. The positive inclusion guarantee inherits the distribution-free coverage property of conformal prediction, while negative exclusion is maximized through learned set geometry optimized on a held-out training split. Experiments on simulated and real-world image datasets demonstrate improved inclusion-exclusion trade-offs compared to standard distance-based conformal baselines.
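For orientation, the standard distance-based conformal baseline that the abstract compares against can be sketched as follows. This is not the proposed minimum-volume method. It is plain split conformal prediction with a Euclidean-distance score, and the function names `conformal_radius` and `in_set`, as well as the assumption that calibration data arrive as paired anchor and positive embeddings, are illustrative only.

```python
import numpy as np

def conformal_radius(anchor_emb, positive_emb, alpha=0.1):
    """Split-conformal radius from calibration (anchor, positive) embedding pairs.

    Score = Euclidean distance between each anchor and its positive.
    Returns the ceil((n + 1) * (1 - alpha))-th smallest score, or +inf when
    the calibration set is too small for the requested coverage level.
    """
    scores = np.linalg.norm(positive_emb - anchor_emb, axis=1)
    n = len(scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))
    if k > n:
        return np.inf
    return np.sort(scores)[k - 1]

def in_set(z, anchor, radius):
    """True if embedding z lies in the ball of the given radius around anchor."""
    return np.linalg.norm(z - anchor, axis=-1) <= radius
```

Under exchangeability of calibration and test pairs, the ball of this radius around a new anchor contains its positive with probability at least 1 - alpha. The ball shape is fixed in advance, which is the limitation that learned set geometry is meant to address.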
cf0d02ec99e61a64137b8a2c3b03e030-Supplemental.pdf
Lemma 5. Let $S = (Z_1, \ldots, Z_n)$ be a collection of $n$ independent random variables and $\Phi$ be an arbitrary random variable defined on the same probability space. Furthermore, each of these summands has zero mean. Given a deterministic algorithm $f$, we consider the algorithm that adds Gaussian noise to the predictions of $f$: $f_\sigma(z, x, R) = f(z, x) + \xi$, (151) where $\xi \sim \mathcal{N}(0, \sigma^2 I_d)$. The figure in the middle repeats the experiment of Figure 1a while making the training algorithm stochastic by randomizing the seed. Table 1: The architecture of the 4-layer convolutional neural network used in MNIST 4 vs 9 classification tasks.
A Proofs of the Main Results
This section describes Stein variational gradient descent (SVGD) by Liu and Wang [19]. The overview is meant as supplementary material for Section 5, where we propose to use SVGD for inferring the DiBS posteriors $p(Z \mid D)$ and $p(Z, \Theta \mid D)$. In contrast to sampling-based MCMC or optimization-based variational inference methods, SVGD iteratively transports a fixed set of particles to closely match a target distribution, akin to the gradient descent algorithm in optimization. We refer the reader to Liu and Wang [19] for additional details. Let $p(x)$ with $x \in \mathcal{X}$ be a differentiable density that we want to sample from, e.g., to estimate an expectation.
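The particle transport described above can be illustrated with a minimal NumPy sketch of the generic SVGD update with an RBF kernel. It is not the DiBS implementation. The names `svgd_step` and `run_svgd`, the median bandwidth heuristic, and the fixed step size are assumptions made for the example.

```python
import numpy as np

def svgd_step(x, grad_log_p, step=0.1):
    """One SVGD update for particles x of shape (n, d).

    Uses the RBF kernel k(a, b) = exp(-||a - b||^2 / h), with h chosen by a
    median heuristic (an assumption of this sketch).
    """
    n = x.shape[0]
    diff = x[:, None, :] - x[None, :, :]           # diff[i, j] = x_i - x_j
    sq_dist = np.sum(diff ** 2, axis=-1)           # squared pairwise distances
    h = np.median(sq_dist) / np.log(n + 1) + 1e-8  # kernel bandwidth
    K = np.exp(-sq_dist / h)                       # kernel matrix, shape (n, n)
    # Attraction: kernel-weighted average of the scores grad log p(x_j).
    attract = K @ grad_log_p(x)
    # Repulsion: sum_j grad_{x_j} k(x_j, x_i) = (2 / h) * sum_j K[i, j] * (x_i - x_j).
    repulse = (2.0 / h) * np.sum(K[:, :, None] * diff, axis=1)
    return x + step * (attract + repulse) / n

def run_svgd(x0, grad_log_p, n_iter=1000, step=0.1):
    """Apply svgd_step repeatedly, starting from the initial particles x0."""
    x = np.array(x0, dtype=float)
    for _ in range(n_iter):
        x = svgd_step(x, grad_log_p, step=step)
    return x

# Example: transport particles initialized near -5 toward the target N(2, 1),
# whose score function is grad log p(x) = -(x - 2).
rng = np.random.default_rng(0)
particles = run_svgd(rng.normal(-5.0, 1.0, size=(50, 1)), lambda x: -(x - 2.0), n_iter=2000)
```

The attraction term moves particles toward regions of high density, while the repulsion term keeps them from collapsing onto a single mode. The particle set is fixed in size and updated deterministically.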