Causal Learning for Heterogeneous Subgroups Based on Nonlinear Causal Kernel Clustering

Liu, Lu, Tang, Yang, Zhang, Kexuan, Sun, Qiyu

arXiv.org Machine Learning 

Due to the challenge posed by multi-source and heterogeneous data collected from diverse environments, causal relationships among features can exhibit variations influenced by different time spans, regions, or strategies. This diversity makes a single causal model inadequate for accurately representing complex causal relationships in all observational data, a crucial consideration in causal learning. To address this challenge, the nonlinear Causal Kernel Clustering method is introduced for heterogeneous subgroup causal learning, highlighting variations in causal relationships across diverse subgroups. The main component for clustering heterogeneous subgroups lies in the construction of the $u$-centered sample mapping function with the property of unbiased estimation, which assesses the differences in potential nonlinear causal relationships in various samples and supported by causal identifiability theory. Experimental results indicate that the method performs well in identifying heterogeneous subgroups and enhancing causal learning, leading to a reduction in prediction error.