 marginal coverage


Retrospective Counterfactual Prediction by Conditioning on the Factual Outcome: A Cross-World Approach

Bodik, Juraj

arXiv.org Machine Learning

Retrospective causal questions ask what would have happened to an observed individual had they received a different treatment. We study the problem of estimating $μ(x,y)=\mathbb{E}[Y(1)\mid X=x,Y(0)=y]$, the expected counterfactual outcome for an individual with covariates $x$ and observed outcome $y$, and constructing valid prediction intervals under the Neyman-Rubin superpopulation model. This quantity is generally not identified without additional assumptions. To link the observed and unobserved potential outcomes, we work with a cross-world correlation $ρ(x)=cor(Y(1),Y(0)\mid X=x)$; plausible bounds on $ρ(x)$ enable a principled approach to this otherwise unidentified problem. We introduce retrospective counterfactual estimators $\hatμ_ρ(x,y)$ and prediction intervals $C_ρ(x,y)$ that asymptotically satisfy $P[Y(1)\in C_ρ(x,y)\mid X=x, Y(0)=y]\ge1-α$ under standard causal assumptions. Many common baselines implicitly correspond to endpoint choices $ρ=0$ or $ρ=1$ (ignoring the factual outcome or treating the counterfactual as a shifted factual outcome). Interpolating between these cases through cross-world dependence yields substantial gains in both theory and practice.
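To make the role of the cross-world correlation concrete, here is a minimal sketch under an extra assumption that the abstract does not make: that $(Y(0), Y(1))$ given $X=x$ is bivariate Gaussian. In that working model the conditional mean is $μ_1(x) + ρ\,\frac{σ_1(x)}{σ_0(x)}\,(y - μ_0(x))$. The function names and the numbers below are hypothetical, and this is not the paper's estimator $\hatμ_ρ(x,y)$; it only illustrates how $ρ=0$ and $ρ=1$ recover the two endpoint baselines.

```python
import math


def gaussian_counterfactual_mean(mu0, mu1, sigma0, sigma1, rho, y):
    """E[Y(1) | Y(0) = y] at a fixed x, ASSUMING a bivariate Gaussian pair.

    mu0, mu1       -- conditional means of Y(0) and Y(1) given X = x
    sigma0, sigma1 -- conditional standard deviations given X = x
    rho            -- assumed cross-world correlation (not identified from data)
    """
    return mu1 + rho * (sigma1 / sigma0) * (y - mu0)


def gaussian_counterfactual_sd(sigma1, rho):
    """Conditional standard deviation of Y(1) given Y(0), same Gaussian assumption."""
    return sigma1 * math.sqrt(1.0 - rho ** 2)


# Hypothetical values for one individual (illustration only).
mu0, mu1, sigma0, sigma1, y = 2.0, 3.0, 1.0, 1.0, 2.5

# rho = 0: the factual outcome is ignored; the prediction is just mu1.
at_rho0 = gaussian_counterfactual_mean(mu0, mu1, sigma0, sigma1, 0.0, y)

# rho = 1 with equal variances: the counterfactual is the factual outcome
# shifted by the average effect, y + (mu1 - mu0).
at_rho1 = gaussian_counterfactual_mean(mu0, mu1, sigma0, sigma1, 1.0, y)

# An intermediate rho interpolates between the two endpoint predictions.
at_half = gaussian_counterfactual_mean(mu0, mu1, sigma0, sigma1, 0.5, y)
```

Under the same Gaussian assumption, the conditional spread shrinks from $σ_1$ at $ρ=0$ to zero at $ρ=1$, which is one way to see why a plausible bound on $ρ(x)$ can tighten retrospective intervals.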


Elements of Conformal Prediction for Statisticians

Sesia, Matteo, Favaro, Stefano

arXiv.org Machine Learning

Predictive inference is a fundamental task in statistics, traditionally addressed using parametric assumptions about the data distribution and detailed analyses of how models learn from data. In recent years, conformal prediction has emerged as a rapidly growing alternative framework that is particularly well suited to modern applications involving high-dimensional data and complex machine learning models. Its appeal stems from being both distribution-free -- relying mainly on symmetry assumptions such as exchangeability -- and model-agnostic, treating the learning algorithm as a black box. Even under such limited assumptions, conformal prediction provides exact finite-sample guarantees, though these are typically of a marginal nature that requires careful interpretation. This paper explains the core ideas of conformal prediction and reviews selected methods. Rather than offering an exhaustive survey, it aims to provide a clear conceptual entry point and a pedagogical overview of the field.
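As a concrete entry point, the following is a minimal sketch of split conformal prediction for regression with absolute-residual scores, one standard method in this family and not necessarily the presentation used in the paper. The function name `split_conformal_interval`, the toy data, and the fixed predictor are hypothetical; the predictor stands in for any black-box model fitted on data separate from the calibration set.

```python
import math
import random


def split_conformal_interval(predict, calibration, x_new, alpha=0.1):
    """Split conformal interval with marginal coverage >= 1 - alpha.

    predict     -- black-box point predictor, fitted on data NOT in `calibration`
    calibration -- list of (x, y) pairs assumed exchangeable with the test point
    """
    # Conformity scores: absolute residuals on the calibration set.
    scores = sorted(abs(y - predict(x)) for x, y in calibration)
    n = len(scores)
    # Finite-sample corrected rank: the ceil((n + 1) * (1 - alpha))-th smallest score.
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: only the trivial interval is valid.
        return (-math.inf, math.inf)
    q = scores[k - 1]
    center = predict(x_new)
    return (center - q, center + q)


# Toy usage with a hypothetical, deliberately imperfect predictor.
random.seed(0)
toy_predict = lambda x: 2.0 * x
toy_calibration = []
for _ in range(200):
    x = random.uniform(0.0, 1.0)
    toy_calibration.append((x, 2.0 * x + random.gauss(0.0, 0.3)))

lower, upper = split_conformal_interval(toy_predict, toy_calibration, 0.5, alpha=0.1)
```

The guarantee here is marginal: it averages over the calibration data and the test point, and it does not by itself promise coverage at a particular value of `x_new`.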




bcdaaa1aec3ae2aa39542acefdec4e4b-Paper-Conference.pdf

Neural Information Processing Systems

Finally, we want our algorithm to have low computational overhead, so that it can be applied as a wrapper on top of arbitrary prediction methods, for both regression and classification.


Training Uncertainty

Neural Information Processing Systems

The first subset (in red) is utilized to evaluate a traditional accuracy-based loss function $\ell_a$, such as the cross entropy. This benchmark is based on a loss function designed to incentivize the trained model to produce the smallest possible conformal prediction sets with the desired coverage (e.g., 90% if $α = 0.1$). The hybrid training procedure is similar to Algorithm 1, in the sense that it relies on analogous soft-sorting, soft-ranking, and soft-indexing algorithms to evaluate a differentiable approximation of the conformity score $W_i$ in (8). Above, the second equality follows directly from the fact that $S(x,U;π,t)$, defined in (A2), is by construction increasing in $t$, and therefore $Y \notin S(x,U;π,1-α)$ if and only if $\min\{t \in [0,1] : Y \in S(x,U;π,t)\} > 1-α$. The proof consists of showing that $\ell_a$ and $\ell_u$ are separately minimized by $\hat{π} = π$, although only approximately in the latter case.



31b3b31a1c2f8a370206f111127c0dbd-Paper.pdf

Neural Information Processing Systems

This framework can accommodate almost any choice of conformity scores, and in fact many different implementations have already been proposed to address our problem. However, it remains unclear how to implement a concrete method from this broad family that can lead to the most informative possible prediction intervals.


2b2011a7d5396faf5899863d896a3c24-Paper-Conference.pdf

Neural Information Processing Systems

A flexible conformal inference method is developed to construct confidence intervals for the frequencies of queried objects in very large data sets, based on a much smaller sketch of those data.