efron
Adaptive Confidence Intervals in Efron's Gaussian Two-Groups Model
Wang, Qiaosen, Chai, Shuwen, Gao, Chao
Robust uncertainty quantification is increasingly important in modern data analysis and is often formalized under Huber's model, which allows an $\varepsilon$-fraction of arbitrary corruptions. In many experimental sciences, however, the measurement protocol is well controlled, and contamination is more plausibly introduced upstream. Motivated by this noise-oblivious nature of adversaries, we study confidence intervals for the null location parameter $θ$ in Efron's Gaussian two-groups model, where an unknown fraction $\varepsilon$ of observations have arbitrarily shifted means, but all samples share the same law of additive Gaussian measurement noise with variance $σ^2$. We characterize the minimax-optimal length among confidence intervals with a prescribed coverage level uniformly over the unknown contamination proportion and all noise-oblivious adversaries. Although prior work has shown that the minimax point estimation rate of theta does not deteriorate when $\varepsilon$ becomes unknown, our results reveal that, with a given $σ^2$, the minimax-optimal length of confidence intervals that are adaptive to unknown $\varepsilon$ is of order $σ(n^{-1/4}+\varepsilon^{1/2}/\max\{1, \log(en \varepsilon^2)\}^{1/2})$, which is polynomially worse than the optimal length when $\varepsilon$ is known. When the variance $σ^2$ is also unknown, we show a further degradation: no adaptive confidence interval can be shorter than $Ω(σn^{-1/8})$. Algorithmically, we introduce a Fourier-based certification procedure built on Carathéodory's positive-semidefiniteness constraints. By scanning candidate points and accepting those whose residual characteristic function is certifiably consistent with a Gaussian location mixture, our algorithm attains the minimax lower bound in the known-variance setting and is computable in polynomial time.
AMDP: An Adaptive Detection Procedure for False Discovery Rate Control in High-Dimensional Mediation Analysis
High-dimensional mediation analysis is often associated with a multiple testing problem for detecting significant mediators. Assessing the uncertainty of this detecting process via false discovery rate (FDR) has garnered great interest. To control the FDR in multiple testing, two essential steps are involved: ranking and selection. Existing approaches either construct p-values without calibration or disregard the joint information across tests, leading to conservation in FDR control or non-optimal ranking rules for multiple hypotheses. In this paper, we develop an adaptive mediation detection procedure (referred to as "AMDP") to identify relevant mediators while asymptotically controlling the FDR in high-dimensional mediation analysis. AMDP produces the optimal rule for ranking hypotheses and proposes a data-driven strategy to determine the threshold for mediator selection. This novel method captures information from the proportions of composite null hypotheses and the distribution of p-values, which turns the high dimensionality into an advantage instead of a limitation. The numerical studies on synthetic and real data sets illustrate the performances of AMDP compared with existing approaches.
Design Stability in Adaptive Experiments: Implications for Treatment Effect Estimation
Sengupta, Saikat, Khamaru, Koulik, Ghosh, Suvrojit, Dasgupta, Tirthankar
We study the problem of estimating the average treatment effect (ATE) under sequentially adaptive treatment assignment mechanisms. In contrast to classical completely randomized designs, we consider a setting in which the probability of assigning treatment to each experimental unit may depend on prior assignments and observed outcomes. Within the potential outcomes framework, we propose and analyze two natural estimators for the ATE: the inverse propensity weighted (IPW) estimator and an augmented IPW (AIPW) estimator. The cornerstone of our analysis is the concept of design stability, which requires that as the number of units grows, either the assignment probabilities converge, or sample averages of the inverse propensity scores and of the inverse complement propensity scores converge in probability to fixed, non-random limits. Our main results establish central limit theorems for both the IPW and AIPW estimators under design stability and provide explicit expressions for their asymptotic variances. We further propose estimators for these variances, enabling the construction of asymptotically valid confidence intervals. Finally, we illustrate our theoretical results in the context of Wei's adaptive coin design and Efron's biased coin design, highlighting the applicability of the proposed methods to sequential experimentation with adaptive randomization.
Generalized Bayesian Ensemble Survival Tree (GBEST) model
Ballante, Elena, Muliere, Pietro, Figini, Silvia
This paper proposes a new class of predictive models for survival analysis called Generalized Bayesian Ensemble Survival Tree (GBEST). It is well known that survival analysis poses many different challenges, in particular when applied to small data or censorship mechanism. Our contribution is the proposal of an ensemble approach that uses Bayesian bootstrap and beta Stacy bootstrap methods to improve the outcome in survival application with a special focus on small datasets. More precisely, a novel approach to integrate Beta Stacy Bayesian bootstrap in bagging tree models for censored data is proposed in this paper. Empirical evidence achieved on simulated and real data underlines that our approach performs better in terms of predictive performances and stability of the results compared with classical survival models available in the literature. In terms of methodology our novel contribution considers the adaptation of recent Bayesian ensemble approaches to survival data, providing a new model called Generalized Bayesian Ensemble Survival Tree (GBEST). A further result in terms of computational novelty is the implementation in R of GBEST, available in a public GitHub repository.
Neural-g: A Deep Learning Framework for Mixing Density Estimation
Wang, Shijie, Chakraborty, Saptarshi, Qin, Qian, Bai, Ray
Mixing (or prior) density estimation is an important problem in machine learning and statistics, especially in empirical Bayes $g$-modeling where accurately estimating the prior is necessary for making good posterior inferences. In this paper, we propose neural-$g$, a new neural network-based estimator for $g$-modeling. Neural-$g$ uses a softmax output layer to ensure that the estimated prior is a valid probability density. Under default hyperparameters, we show that neural-$g$ is very flexible and capable of capturing many unknown densities, including those with flat regions, heavy tails, and/or discontinuities. In contrast, existing methods struggle to capture all of these prior shapes. We provide justification for neural-$g$ by establishing a new universal approximation theorem regarding the capability of neural networks to learn arbitrary probability mass functions. To accelerate convergence of our numerical implementation, we utilize a weighted average gradient descent approach to update the network parameters. Finally, we extend neural-$g$ to multivariate prior density estimation. We illustrate the efficacy of our approach through simulations and analyses of real datasets. A software package to implement neural-$g$ is publicly available at https://github.com/shijiew97/neuralG.
Hollywood Faces Its Post-Strike Future
On Wednesday night, the actor Jeremy Allen White, of "The Bear," was working his way down a red carpet in Dallas. It was the première of "The White Claw," an A24 movie about the Von Erich clan of professional wrestlers. On the carpet, an "Entertainment Tonight" reporter informed White, "We just heard moments ago--the strike is over!" and stuck the mike in his face. "That's amazing," White said, seeming taken aback. Asked how he felt, he added, "I don't know the details of the deal, but I'm sure that SAG got what we wanted."
Deconfounding and Causal Regularization for Stability and External Validity
Bühlmann, Peter, Ćevid, Domagoj
Brad Efron, in his lecture at the occasion of receiving the International Prize in Statistics, brought up some fascinating thoughts on "prediction, estimation and attribution", with particular attention to the new "wide data era" which has entered statistics and data science more generally (Efron, 2019, 2020). Looking back almost 20 years ago, there has been a huge development in statistics since Leo Breiman's article "Statistical Modeling: The Two Cultures" (Breiman, 2001). Even more broadly, data science has become an emerging new field and profession. It deals with information extraction from data, often in close proximity with other sciences. Its historical roots are in statistics, and statistical "critical" thinking plays an ever important role in inference from data to models and prediction. There are many interesting facets of this broad topic, see for example David Donoho's "50 years of Data Science" (Donoho, 2017) or Bin Yu's "Veridical Data Science" (Yu and Kumbier, 2020). Efron (2019, 2020) has formulated intriguing ideas on "prediction, estimation and attribution". We are presenting here a few additional considerations on the topic, as outlined in the following Sections 1.1 and 1.2.
Statistical Bootstrapping for Uncertainty Estimation in Off-Policy Evaluation
In reinforcement learning, it is typical to use the empirically observed transitions and rewards to estimate the value of a policy via either model-based or Q-fitting approaches. Although straightforward, these techniques in general yield biased estimates of the true value of the policy. In this work, we investigate the potential for statistical bootstrapping to be used as a way to take these biased estimates and produce calibrated confidence intervals for the true value of the policy. We identify conditions - specifically, sufficient data size and sufficient coverage - under which statistical bootstrapping in this setting is guaranteed to yield correct confidence intervals. In practical situations, these conditions often do not hold, and so we discuss and propose mechanisms that can be employed to mitigate their effects. We evaluate our proposed method and show that it can yield accurate confidence intervals in a variety of conditions, including challenging continuous control environments and small data regimes.