outcome distribution
Doubly-Robust Estimation of Counterfactual Policy Mean Embeddings
Estimating the distribution of outcomes under counterfactual policies is critical for decision-making in domains such as recommendation, advertising, and healthcare. We propose and analyze a novel framework--Counterfactual Policy Mean Embedding (CPME)--that represents the entire counterfactual outcome distribution in a reproducing kernel Hilbert space (RKHS), enabling flexible and nonparametric distributional off-policy evaluation. We introduce both a plug-in estimator and a doubly robust estimator; the latter enjoys improved convergence rates by correcting for bias in both the outcome embedding and propensity models. Building on this, we develop a doubly robust kernel test statistic for hypothesis testing, which achieves asymptotic normality and thus enables computationally efficient testing and straightforward construction of confidence intervals. Our framework also supports sampling from the counterfactual distribution. Numerical simulations illustrate the practical benefits of CPME over existing methods.
Extended Wasserstein-GAN Approach to Causal Distribution Learning: Density-Free Estimation and Minimax Optimality
Tamano, Shu, Imaizumi, Masaaki
Distributional causal inference requires estimating not only average treatment effects but also interventional outcome distributions, including quantiles, tail risks, and policy-dependent uncertainty. As a method for distributional causal inference, generative adversarial network (GAN)-based counterfactual methods are flexible tools for this task. However, these methods have several limitations. First, the objectives of certain techniques do not coincide with the statistical risk of the identifiable causal target, and therefore provide limited theoretical guarantees regarding estimable counterfactual distributions or optimality. Second, they tend to rely on unstable density-based methods, such as density ratio estimation. In this paper, we propose GANICE (GAN for Interventional Conditional Estimation) with several advantages: it (i) clarifies the conditional interventional distribution for each treatment--covariate state as the causal estimation target; (ii) estimates the conditional distribution such that its averaged Wasserstein risk is minimized; (iii) establishes minimax optimality. GANICE achieves these advantages through the introduction of the extended Wasserstein distance, the incorporation of a cellwise critic in its dual, and an optimality proof based on Besov space theory. Our experiments demonstrate that GANICE consistently outperforms existing methods.
Distributional Causal Mediation via Conditional Generative Modeling
Zhang, Jinlun, Huang, Haoneng, Zhan, Zishu, Ou, Chunquan
Mediation analysis has traditionally focused on outcome-level summary contrasts, such as mean effects, which may obscure substantial distributional changes induced by complex and nonlinear causal mechanisms. We propose Distributional Causal Mediation Analysis (DCMA), a generative learning framework for identifying and estimating treatment effects on entire outcome distributions transmitted through multiple mediators. DCMA learns conditional generative models for the mediators and the outcome, recovering the relevant conditional distributions from observational data. Leveraging the identification formulas, it reconstructs interventional outcome distributions via Monte Carlo forward simulation by noise resampling, enabling the capture of both classical summary effects and rich distributional contrasts such as energy distance and the Wasserstein distance. Analytical error bounds are derived to decompose how estimation errors in the learned conditional models propagate to the reconstructed interventional outcome distributions. The empirical effectiveness of DCMA is demonstrated through numerical experiments and real-world data applications.
Nonparametric efficient inference for network quantile causal effects under partial interference
Interference arises when the treatment assigned to one individual affects the outcomes of other individuals. Commonly, individuals are naturally grouped into clusters, and interference occurs only among individuals within the same cluster, a setting referred to as partial interference. We study network causal effects on outcome quantiles in the presence of partial interference. We develop a general nonparametric efficiency theory for estimating these network quantile causal effects, which leads to a nonparametrically efficient estimator. The proposed estimator is consistent and asymptotically normal with parametric convergence rates, while allowing for flexible, data-adaptive estimation of complex nuisance functions. We leverage a three-way cross-fitting procedure that avoids direct estimation of the conditional outcome distribution. Simulations demonstrate adequate finite-sample performance of the proposed estimators, and we apply the methods to a clustered observational study.
Are language models aware of the road not taken? Token-level uncertainty and hidden state dynamics
Zur, Amir, Geiger, Atticus, Lubana, Ekdeep Singh, Bigelow, Eric
When a language model generates text, the selection of individual tokens might lead it down very different reasoning paths, making uncertainty difficult to quantify. In this work, we consider whether reasoning language models represent the alternate paths that they could take during generation. To test this hypothesis, we use hidden activations to control and predict a language model's uncertainty during chain-of-thought reasoning. In our experiments, we find a clear correlation between how uncertain a model is at different tokens, and how easily the model can be steered by controlling its activations. This suggests that activation interventions are most effective when there are alternate paths available to the model -- in other words, when it has not yet committed to a particular final answer. We also find that hidden activations can predict a model's future outcome distribution, demonstrating that models implicitly represent the space of possible paths.
Export Reviews, Discussions, Author Feedback and Meta-Reviews
First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. The authors give an algorithm for easy partial-monitoring games, ones that satisfy the local observability condition of Bartok et al. Their algorithm BPM attains the O(\sqrt{T}) rate which is minimax optimal for such games. Originality and Significance: There are already algorithms that attain O(\sqrt{T}) regret for easy partial monitoring games. Indeed, the authors compare themselves against the CBP algorithm of Bartok et al.