instrumental variable
Automatic Visual Instrumental Variable Learning for Confounding-Resistant Domain Generalization
Many confounding-resistant domain generalization methods for image classification have been developed based on causal interventions. However, their reliance on strong assumptions limits their effectiveness in handling unobserved confounders. Although recent work introduces instrumental variables (IVs) to overcome this limitation, the reliance on manually predefined instruments, particularly in the context of visual data, may result in severe bias or invalidity when IV conditions are violated. To address these issues, we propose a novel approach to automatically learning Visual Instrumental Variables for confounding-resistant Domain Generalization (VIV-DG). We observe that certain non-causal visual attributes in image data naturally satisfy the basic conditions required for valid IVs. Motivated by this insight, we propose the visual instrumental variable, a novel concept that extends classical IV theory to the visual domain. Furthermore, we develop an automatic visual instrumental variable learner that enforces IV conditions on learned representations, enabling the automatic learning of valid visual instrumental variables from image data. Ultimately, VIV-DG inherits the strengths of classical IVs to mitigate unobserved confounding and avoids the significant bias caused by violations of IV conditions in predefined IVs. Extensive experiments on multiple benchmarks verify that VIV-DG achieves superior generalization ability.
Double Machine Learning for Static Panel Data with Instrumental Variables: New Method and Applications
Baiardi, Anna, Clarke, Paul S., Naghi, Andrea A., Polselli, Annalivia
Panel data methods are widely used in empirical analysis to address unobserved heterogeneity, but causal inference remains challenging when treatments are endogenous and confounding variables high-dimensional and potentially nonlinear. Standard instrumental variables (IV) estimators, such as two-stage least squares (2SLS), become unreliable when instrument validity requires flexibly conditioning on many covariates with potentially non-linear effects. This paper develops a Double Machine Learning estimator for static panel models with endogenous treatments (panel IV DML), and introduces weak-identification diagnostics for it. We revisit three influential migration studies that use shift-share instruments. In these settings, instrument validity depends on a rich covariate adjustment. In one application, panel IV DML strengthens the predictive power of the instrument and broadly confirms 2SLS results. In the other cases, flexible adjustment makes the instruments weak, leading to substantially more cautious causal inference than conventional 2SLS. Monte Carlo evidence supports these findings, showing that panel IV DML improves estimation accuracy under strong instruments and delivers more reliable inference under weak identification.
Causal Effect Estimation with Learned Instrument Representations
Dean, Frances, Fields, Jenna, Bhalerao, Radhika, Charpignon, Marie, Alaa, Ahmed
Instrumental variable (IV) methods mitigate bias from unobserved confounding in observational causal inference but rely on the availability of a valid instrument, which can often be difficult or infeasible to identify in practice. In this paper, we propose a representation learning approach that constructs instrumental representations from observed covariates, which enable IV-based estimation even in the absence of an explicit instrument. Our model (ZNet) achieves this through an architecture that mirrors the structural causal model of IVs; it decomposes the ambient feature space into confounding and instrumental components, and is trained by enforcing empirical moment conditions corresponding to the defining properties of valid instruments (i.e., relevance, exclusion restriction, and instrumental unconfoundedness). Importantly, ZNet is compatible with a wide range of downstream two-stage IV estimators of causal effects. Our experiments demonstrate that ZNet can (i) recover ground-truth instruments when they already exist in the ambient feature space and (ii) construct latent instruments in the embedding space when no explicit IVs are available. This suggests that ZNet can be used as a ``plug-and-play'' module for causal inference in general observational settings, regardless of whether the (untestable) assumption of unconfoundedness is satisfied.
A Comparative Study of Model Adaptation Strategies for Multi-Treatment Uplift Modeling
Zhang, Ruyue, Ke, Xiaopeng, Liu, Ming, Shi, Fangzhou, Men, Chang, Zhu, Zhengdan
Uplift modeling has emerged as a crucial technique for individualized treatment effect estimation, particularly in fields such as marketing and healthcare. Modeling uplift effects in multi-treatment scenarios plays a key role in real-world applications. Current techniques for modeling multi-treatment uplift are typically adapted from binary-treatment works. In this paper, we investigate and categorize all current model adaptations into two types: Structure Adaptation and Feature Adaptation. Through our empirical experiments, we find that these two adaptation types cannot maintain effectiveness under various data characteristics (noisy data, mixed with observational data, etc.). To enhance estimation ability and robustness, we propose Orthogonal Function Adaptation (OFA) based on the function approximation theorem. We conduct comprehensive experiments with multiple data characteristics to study the effectiveness and robustness of all model adaptation techniques. Our experimental results demonstrate that our proposed OFA can significantly improve uplift model performance compared to other vanilla adaptation methods and exhibits the highest robustness.
Identification and Debiased Learning of Causal Effects with General Instrumental Variables
Chen, Shuyuan, Zhang, Peng, Cui, Yifan
Instrumental variable methods are fundamental to causal inference when treatment assignment is confounded by unobserved variables. In this article, we develop a general nonparametric framework for identification and learning with multi-categorical or continuous instrumental variables. Specifically, we propose an additive instrumental variable framework to identify mean potential outcomes and the average treatment effect with a weighting function. Leveraging semiparametric theory, we derive efficient influence functions and construct consistent, asymptotically normal estimators via debiased machine learning. Extensions to longitudinal data, dynamic treatment regimes, and multiplicative instrumental variables are further developed. We demonstrate the proposed method by employing simulation studies and analyzing real data from the Job Training Partnership Act program.
Data-Faithful Feature Attribution: Mitigating Unobservable Confounders via Instrumental Variables
The state-of-the-art feature attribution methods often neglect the influence of unobservable confounders, posing a risk of misinterpretation, especially when it is crucial for the interpretation to remain faithful to the data. To counteract this, we propose a new approach, data-faithful feature attribution, which trains a confounder-free model using instrumental variables.
Causal Effect Identification in lvLiNGAM from Higher-Order Cumulants
Tramontano, Daniele, Kivva, Yaroslav, Salehkaleybar, Saber, Drton, Mathias, Kiyavash, Negar
This paper investigates causal effect identification in latent variable Linear Non-Gaussian Acyclic Models (lvLiNGAM) using higher-order cumulants, addressing two prominent setups that are challenging in the presence of latent confounding: (1) a single proxy variable that may causally influence the treatment and (2) underspecified instrumental variable cases where fewer instruments exist than treatments. We prove that causal effects are identifiable with a single proxy or instrument and provide corresponding estimation methods. Experimental results demonstrate the accuracy and robustness of our approaches compared to existing methods, advancing the theoretical and practical understanding of causal inference in linear systems with latent confounders.
Weak instrumental variables due to nonlinearities in panel data: A Super Learner Control Function estimator
A triangular structural panel data model with additive separable individual-specific effects is used to model the causal effect of a covariate on an outcome variable when there are unobservable confounders with some of them time-invariant. In this setup, a linear reduced-form equation might be problematic when the conditional mean of the endogenous covariate and the instrumental variables is nonlinear. The reason is that ignoring the nonlinearity could lead to weak instruments As a solution, we propose a triangular simultaneous equation model for panel data with additive separable individual-specific fixed effects composed of a linear structural equation with a nonlinear reduced form equation. The parameter of interest is the structural parameter of the endogenous variable. The identification of this parameter is obtained under the assumption of available exclusion restrictions and using a control function approach. Estimating the parameter of interest is done using an estimator that we call Super Learner Control Function estimator (SLCFE). The estimation procedure is composed of two main steps and sample splitting. We estimate the control function using a super learner using sample splitting. In the following step, we use the estimated control function to control for endogeneity in the structural equation. Sample splitting is done across the individual dimension. We perform a Monte Carlo simulation to test the performance of the estimators proposed. We conclude that the Super Learner Control Function Estimators significantly outperform Within 2SLS estimators.
Principal-Agent Multitasking: the Uniformity of Optimal Contracts and its Efficient Learning via Instrumental Regression
This work studies the multitasking principal-agent problem. I first show a ``uniformity'' result. Specifically, when the tasks are perfect substitutes, and the agent's cost function is homogeneous to a certain degree, then the optimal contract only depends on the marginal utility of each task and the degree of homogeneity. I then study a setting where the marginal utility of each task is unknown so that the optimal contract must be learned or estimated with observational data. I identify this problem as a regression problem with measurement errors and observe that this problem can be cast as an instrumental regression problem. The current works observe that both the contract and the repeated observations (when available) can act as valid instrumental variables, and propose using the generalized method of moments estimator to compute an approximately optimal contract from offline data. I also study an online setting and show how the optimal contract can be efficiently learned in an online fashion using the two estimators. Here the principal faces an exploration-exploitation tradeoff: she must experiment with new contracts and observe their outcome whilst at the same time ensuring her experimentations are not deviating too much from the optimal contract. This work shows when repeated observations are available and agents are sufficiently ``diverse", the principal can achieve a very low $\widetilde{O}(d)$ cumulative utility loss, even with a ``pure exploitation" algorithm.