Goto

Collaborating Authors

 effect estimation






Do Finetti: On Causal Effects for Exchangeable Data

Neural Information Processing Systems

We study causal effect estimation in a setting where the data are not i.i.d.$\ $(independent and identically distributed). We focus on exchangeable data satisfying an assumption of independent causal mechanisms. Traditional causal effect estimation frameworks, e.g., relying on structural causal models and do-calculus, are typically limited to i.i.d.


Front-door Adjustment Beyond Markov Equivalence with Limited Graph Knowledge

Neural Information Processing Systems

Causal effect estimation from data typically requires assumptions about the cause-effect relations either explicitly in the form of a causal graph structure within the Pearlian framework, or implicitly in terms of (conditional) independence statements between counterfactual variables within the potential outcomes framework. When the treatment variable and the outcome variable are confounded, front-door adjustment is an important special case where, given the graph, causal effect of the treatment on the target can be estimated using post-treatment variables. However, the exact formula for front-door adjustment depends on the structure of the graph, which is difficult to learn in practice. In this work, we provide testable conditional independence statements to compute the causal effect using front-door-like adjustment without knowing the graph under limited structural side information. We show that our method is applicable in scenarios where knowing the Markov equivalence class is not sufficient for causal effect estimation. We demonstrate the effectiveness of our method on a class of random graphs as well as real causal fairness benchmarks.


Multiple Treatments Causal Effects Estimation with Task Embeddings and Balanced Representation Learning

Murakami, Yuki, Hattori, Takumi, Kubota, Kohsuke

arXiv.org Artificial Intelligence

The simultaneous application of multiple treatments is increasingly common in many fields, such as healthcare and marketing. In such scenarios, it is important to estimate the single treatment effects and the interaction treatment effects that arise from treatment combinations. Previous studies have proposed using independent outcome networks with subnetworks for interactions, or combining task embedding networks that capture treatment similarity with variational autoencoders. However, these methods suffer from the lack of parameter sharing among related treatments, or the estimation of unnecessary latent variables reduces the accuracy of causal effect estimation. To address these issues, we propose a novel deep learning framework that incorporates a task embedding network and a representation learning network with the balancing penalty. The task embedding network enables parameter sharing across related treatment patterns because it encodes elements common to single effects and contributions specific to interaction effects. The representation learning network with the balancing penalty learns representations nonparametrically from observed covariates while reducing distances in representation distributions across different treatment patterns. This process mitigates selection bias and avoids model misspecification. Simulation studies demonstrate that the proposed method outperforms existing baselines, and application to real-world marketing datasets confirms the practical implications and utility of our framework.


PITE: Multi-Prototype Alignment for Individual Treatment Effect Estimation

Cao, Fuyuan, Zhang, Jiaxuan, Li, Xiaoli

arXiv.org Artificial Intelligence

Estimating Individual Treatment Effects (ITE) from observational data is challenging due to confounding bias. Most studies tackle this bias by balancing distributions globally, but ignore individual heterogeneity and fail to capture the local structure that represents the natural clustering among individuals, which ultimately compromises ITE estimation. While instance-level alignment methods consider heterogeneity, they similarly overlook the local structure information. To address these issues, we propose an end-to-end Multi-\textbf{P}rototype alignment method for \textbf{ITE} estimation (\textbf{PITE}). PITE effectively captures local structure within groups and enforces cross-group alignment, thereby achieving robust ITE estimation. Specifically, we first define prototypes as cluster centroids based on similar individuals under the same treatment. To identify local similarity and the distribution consistency, we perform instance-to-prototype matching to assign individuals to the nearest prototype within groups, and design a multi-prototype alignment strategy to encourage the matched prototypes to be close across treatment arms in the latent space. PITE not only reduces distribution shift through fine-grained, prototype-level alignment, but also preserves the local structures of treated and control groups, which provides meaningful constraints for ITE estimation. Extensive evaluations on benchmark datasets demonstrate that PITE outperforms 13 state-of-the-art methods, achieving more accurate and robust ITE estimation.


A Comparative Study of Model Adaptation Strategies for Multi-Treatment Uplift Modeling

Zhang, Ruyue, Ke, Xiaopeng, Liu, Ming, Shi, Fangzhou, Men, Chang, Zhu, Zhengdan

arXiv.org Artificial Intelligence

Uplift modeling has emerged as a crucial technique for individualized treatment effect estimation, particularly in fields such as marketing and healthcare. Modeling uplift effects in multi-treatment scenarios plays a key role in real-world applications. Current techniques for modeling multi-treatment uplift are typically adapted from binary-treatment works. In this paper, we investigate and categorize all current model adaptations into two types: Structure Adaptation and Feature Adaptation. Through our empirical experiments, we find that these two adaptation types cannot maintain effectiveness under various data characteristics (noisy data, mixed with observational data, etc.). To enhance estimation ability and robustness, we propose Orthogonal Function Adaptation (OFA) based on the function approximation theorem. We conduct comprehensive experiments with multiple data characteristics to study the effectiveness and robustness of all model adaptation techniques. Our experimental results demonstrate that our proposed OFA can significantly improve uplift model performance compared to other vanilla adaptation methods and exhibits the highest robustness.


A Relative Error-Based Evaluation Framework of Heterogeneous Treatment Effect Estimators

Guo, Jiayi, Li, Haoxuan, Tian, Ye, Wu, Peng

arXiv.org Machine Learning

While significant progress has been made in heterogeneous treatment effect (HTE) estimation, the evaluation of HTE estimators remains underdeveloped. In this article, we propose a robust evaluation framework based on relative error, which quantifies performance differences between two HTE estimators. We first derive the key theoretical conditions on the nuisance parameters that are necessary to achieve a robust estimator of relative error. Building on these conditions, we introduce novel loss functions and design a neural network architecture to estimate nuisance parameters and obtain robust estimation of relative error, thereby achieving reliable evaluation of HTE estimators. We provide the large sample properties of the proposed relative error estimator. Furthermore, beyond evaluation, we propose a new learning algorithm for HTE that leverages both the previously HTE estimators and the nuisance parameters learned through our neural network architecture. Extensive experiments demonstrate that our evaluation framework supports reliable comparisons across HTE estimators, and the proposed learning algorithm for HTE exhibits desirable performance.