

Leveraging heterogeneity for identifiability: Bayesian order-based learning of multiple DAGs

arXiv.org Machine Learning

We propose a joint order-based scoring framework for causal structure learning of directed acyclic graph (DAG) models under heterogeneous data settings. We show that leveraging heterogeneity improves the accuracy of causal ordering estimation. In the most favorable case, the causal ordering is identifiable up to two permutations. Building on this framework, we propose an order-based Bayesian method for Gaussian DAG models and establish its theoretical properties in the high-dimensional regime. For posterior inference over the space of orderings, we introduce a random-to-random (R2R) proposal neighborhood for the Metropolis-Hastings algorithm, which is theoretically motivated and exhibits efficient mixing behavior. Simulation studies confirm the strong empirical performance of the proposed method, and an application to single-nucleus RNA sequencing data from major depressive disorder demonstrates practical utility.
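The random-to-random move mentioned in the abstract can be illustrated with a minimal sketch. It assumes the R2R neighborhood is the standard random-to-random shuffle (remove one element of the ordering and reinsert it at a random position); the abstract does not spell out the exact construction. The names `r2r_proposal`, `mh_step`, and `log_score` are hypothetical, and `log_score` stands in for whatever log posterior score over orderings the method uses. Under this assumption the move is symmetric, so the Hastings correction cancels in the acceptance ratio.

```python
import math
import random


def r2r_proposal(order, rng=random):
    """Move one randomly chosen element to a randomly chosen position.

    Assumed form of a random-to-random move; returns a new list and
    leaves the input ordering unchanged.
    """
    proposal = list(order)
    i = rng.randrange(len(proposal))       # position to remove from
    item = proposal.pop(i)
    j = rng.randrange(len(proposal) + 1)   # position to reinsert at
    proposal.insert(j, item)
    return proposal


def mh_step(order, log_score, rng=random):
    """One Metropolis-Hastings step over orderings.

    log_score is a hypothetical callable returning the log of an
    unnormalized posterior score for an ordering. The proposal is
    symmetric, so only the score ratio enters the acceptance rule.
    """
    proposal = r2r_proposal(order, rng)
    log_ratio = log_score(proposal) - log_score(order)
    accept_prob = math.exp(min(0.0, log_ratio))
    if rng.random() < accept_prob:
        return proposal
    return list(order)
```

Repeated calls to `mh_step` would produce a chain over orderings; how well such a chain mixes depends on the score, which this sketch does not model.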







Supplement to "Uniform Concentration Bounds toward a Unified Framework for Robust Clustering"

Neural Information Processing Systems

For the theoretical exposition, we first establish the following lemmas. Lemma A.1 proves that the derivative of the function $\phi$ is bounded in the $\ell_2$-norm when the domain is restricted to the support of $P$. Lemma A.3 proves that the function $f_\Theta$, as a function of $\Theta$, is Lipschitz with respect to the $\|\cdot\|$ norm. Thus, from equation (1), $\langle \nabla\phi(P_C(\theta)) - \nabla\phi(\theta),\, x - P_C(\theta) \rangle \ge 0$. (2) We now observe that $d_\phi(x,\theta) - d_\phi(x,P_C(\theta)) - d_\phi(P_C(\theta),\theta) = \langle \nabla\phi(P_C(\theta)) - \nabla\phi(\theta),\, x - P_C(\theta) \rangle \ge 0$. Hence the result. 35th Conference on Neural Information Processing Systems (NeurIPS 2021).
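The identity used in the last step can be checked term by term. The sketch below assumes that $d_\phi$ denotes the Bregman divergence generated by a differentiable function $\phi$, which the excerpt does not state explicitly, and writes $p$ as shorthand for $P_C(\theta)$.

```latex
% Assumption: d_\phi(x, y) = \phi(x) - \phi(y) - \langle \nabla\phi(y), x - y \rangle
% Shorthand:  p = P_C(\theta)
\begin{align*}
d_\phi(x,\theta) - d_\phi(x,p) - d_\phi(p,\theta)
  &= \bigl[\phi(x) - \phi(\theta) - \langle \nabla\phi(\theta), x - \theta \rangle\bigr] \\
  &\quad - \bigl[\phi(x) - \phi(p) - \langle \nabla\phi(p), x - p \rangle\bigr] \\
  &\quad - \bigl[\phi(p) - \phi(\theta) - \langle \nabla\phi(\theta), p - \theta \rangle\bigr] \\
  &= \langle \nabla\phi(p), x - p \rangle
     - \langle \nabla\phi(\theta), (x - \theta) - (p - \theta) \rangle \\
  &= \langle \nabla\phi(p) - \nabla\phi(\theta),\, x - p \rangle .
\end{align*}
```

All terms in $\phi$ cancel, so only the inner products remain; the sign of the final expression then comes from the inequality labeled (2).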



Concentration inequalities under sub-Gaussian and sub-exponential conditions

Neural Information Processing Systems

We prove analogues of the popular bounded difference inequality (also called McDiarmid's inequality) for functions of independent random variables under sub-Gaussian and sub-exponential conditions. Applied to vector-valued concentration and the method of Rademacher complexities, these inequalities allow an easy extension of uniform convergence results for PCA and linear regression to the case of potentially unbounded input and output variables.
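For reference, the classical bounded difference inequality that these results generalize can be stated as follows. This is the standard textbook form, not the paper's sub-Gaussian or sub-exponential version; the symbols $f$, $c_i$, and $t$ are notation introduced here.

```latex
% Classical McDiarmid (bounded difference) inequality.
% X_1, ..., X_n independent; changing the i-th argument of f alone
% changes its value by at most c_i.
\[
\sup_{x_1,\dots,x_n,\,x_i'}
  \bigl| f(x_1,\dots,x_i,\dots,x_n) - f(x_1,\dots,x_i',\dots,x_n) \bigr| \le c_i
  \quad (i = 1,\dots,n)
\]
\[
\Longrightarrow\quad
\Pr\bigl( f(X_1,\dots,X_n) - \mathbb{E}\, f(X_1,\dots,X_n) \ge t \bigr)
  \le \exp\!\left( -\frac{2t^2}{\sum_{i=1}^n c_i^2} \right)
  \quad \text{for all } t > 0 .
\]
```

The uniform bound $c_i$ is what fails for unbounded inputs, which is the case the abstract's sub-Gaussian and sub-exponential conditions are meant to cover.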