Calibeating Prediction-Powered Inference
van der Laan, Lars, van der Laan, Mark
We study semisupervised mean estimation with a small labeled sample, a large unlabeled sample, and a black-box prediction model whose output may be miscalibrated. A standard approach in this setting is augmented inverse-probability weighting (AIPW) [Robins et al., 1994], which protects against prediction-model misspecification but can be inefficient when the prediction score is poorly aligned with the outcome scale. We introduce Calibrated Prediction-Powered Inference, which post-hoc calibrates the prediction score on the labeled sample before using it for semisupervised estimation. This simple step requires no retraining and can improve the original score both as a predictor of the outcome and as a regression adjustment for semisupervised inference. We study both linear and isotonic calibration. For isotonic calibration, we establish first-order optimality guarantees: isotonic post-processing can improve predictive accuracy and estimator efficiency relative to the original score and simpler post-processing rules, while no further post-processing of the fitted isotonic score yields additional first-order gains. For linear calibration, we show first-order equivalence to PPI++. We also clarify the relationship among existing estimators, showing that the original PPI estimator is a special case of AIPW and can be inefficient when the prediction model is accurate, while PPI++ is AIPW with empirical efficiency maximization [Rubin et al., 2008]. In simulations and real-data experiments, our calibrated estimators often outperform PPI and are competitive with, or outperform, AIPW and PPI++. We provide an accompanying Python package, ppi_aipw, at https://larsvanderlaan.github.io/ppi-aipw/.
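The abstract's core recipe (calibrate the score on the labeled sample, then plug it into an AIPW/PPI-style mean estimator) can be sketched in a few lines of numpy. This is a minimal illustration, not the paper's implementation: the synthetic data, the affine miscalibration of the score, and all variable names are assumptions for the sketch, and linear calibration stands in for the isotonic variant.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic setup (illustrative, not the paper's data): true outcome mean
# is 1.0; the black-box score tracks Y but is affinely miscalibrated.
n_lab, n_unlab = 1_000, 50_000
y_lab = rng.normal(1.0, 1.0, n_lab)
y_unlab = rng.normal(1.0, 1.0, n_unlab)      # never observed in practice
s_lab = 2.0 * y_lab + 1.0 + rng.normal(0.0, 0.1, n_lab)
s_unlab = 2.0 * y_unlab + 1.0 + rng.normal(0.0, 0.1, n_unlab)

# Linear calibration on the labeled sample: regress Y on the score.
b, a = np.polyfit(s_lab, y_lab, 1)           # slope, intercept
g = lambda s: a + b * s                      # calibrated score

# AIPW/PPI-style mean estimate with the calibrated score: plug-in term on
# the unlabeled data plus a bias correction on the labeled data.
theta_hat = g(s_unlab).mean() + (y_lab - g(s_lab)).mean()
naive_plugin = s_unlab.mean()                # raw score alone is badly biased
```

The bias-correction term makes the estimate consistent even if calibration is imperfect; the calibration step only affects efficiency, which is the sense in which post-processing "powers up" the inference.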
AI-powered robot beats elite table tennis players
In a feat hailed as a milestone in robotics, Sony AI's Ace wins three out of five matches played under official rules

An AI-powered robot has beaten elite players at table tennis, a significant achievement for a machine facing human athletes in a real-world competitive sport. Named Ace, the robotic system developed by Sony AI won three out of five matches against elite players, but lost the two it played against professionals, clawing back only one game across the seven contests.

The feat has been hailed as a milestone for robotics, a field that has long seen table tennis - with the lightning-fast reactions, perception and skill it demands - as one of the toughest tests of how far the technology has advanced. In the matches, played under official competition rules, Ace displayed a mastery of spin, handled difficult shots such as balls catching on the net, and pulled off one rapid backspin shot that a professional had thought impossible.

A research paper on the robot was published in Nature on Wednesday, but scientists working on the project said Ace had improved since the report was submitted.
Overcoming Selection Bias in Statistical Studies With Amortized Bayesian Inference
Arruda, Jonas, Chervet, Sophie, Staudt, Paula, Wieser, Andreas, Hoelscher, Michael, Sermet-Gaudelus, Isabelle, Binder, Nadine, Opatowski, Lulla, Hasenauer, Jan
Selection bias arises when the probability that an observation enters a dataset depends on variables related to the quantities of interest, leading to systematic distortions in estimation and uncertainty quantification. For example, in epidemiological or survey settings, individuals with certain outcomes may be more likely to be included, resulting in biased prevalence estimates with potentially substantial downstream impact. Classical corrections, such as inverse-probability weighting or explicit likelihood-based models of the selection process, rely on tractable likelihoods, which limits their applicability in complex stochastic models with latent dynamics or high-dimensional structure. Simulation-based inference enables Bayesian analysis without tractable likelihoods but typically assumes missingness at random and thus fails when selection depends on unobserved outcomes or covariates. Here, we develop a bias-aware simulation-based inference framework that explicitly incorporates selection into neural posterior estimation. By embedding the selection mechanism directly into the generative simulator, the approach enables amortized Bayesian inference without requiring tractable likelihoods. This recasting of selection bias as part of the simulation process allows us to both obtain debiased estimates and explicitly test for the presence of bias. The framework integrates diagnostics to detect discrepancies between simulated and observed data and to assess posterior calibration. The method recovers well-calibrated posterior distributions across three statistical applications with diverse selection mechanisms, including settings in which likelihood-based approaches yield biased estimates. These results recast the correction of selection bias as a simulation problem and establish simulation-based inference as a practical and testable strategy for parameter estimation under selection bias.
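The key move in the abstract is to embed the selection mechanism directly into the generative simulator, so that the simulator produces data with the same bias as the observations. A minimal sketch of such a simulator is below; the prevalence setting, inclusion probabilities, and all names are hypothetical illustrations, and the neural posterior estimation step that would be trained on this simulator's output is only described, not implemented.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical prevalence study: positives are far more likely to be
# sampled, so the observed prevalence overstates the true one.
def simulate(theta, n=20_000, p_incl_pos=0.9, p_incl_neg=0.3):
    """Generative simulator with the selection step embedded."""
    y = rng.random(n) < theta                      # latent binary outcomes
    p_incl = np.where(y, p_incl_pos, p_incl_neg)   # outcome-dependent inclusion
    selected = rng.random(n) < p_incl
    return y[selected]                             # only selected rows observed

theta_true = 0.2
obs = simulate(theta_true)
naive = obs.mean()                                 # ignores selection: biased up

# Prevalence among selected individuals, by Bayes' rule:
expected_selected_prev = (0.2 * 0.9) / (0.2 * 0.9 + 0.8 * 0.3)
```

In the amortized framework, a posterior network would be trained on (theta, simulated-selected-data) pairs drawn from this simulator, so the inversion automatically accounts for selection without ever writing down a likelihood.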
Virtual Dummies: Enabling Scalable FDR-Controlled Variable Selection via Sequential Sampling of Null Features
Koka, Taulant, Machkour, Jasin, Palomar, Daniel P., Muma, Michael
High-dimensional variable selection, particularly in genomics, requires error-controlling procedures that scale to millions of predictors. The Terminating-Random Experiments (T-Rex) selector achieves false discovery rate (FDR) control by aggregating results of early terminated random experiments, each combining original predictors with i.i.d. synthetic null variables (dummies). At biobank scales, however, explicit dummy augmentation requires terabytes of memory. We demonstrate that this bottleneck is not fundamental. Formalizing the information flow of forward selection through a filtration, we show that compatible selectors interact with unselected dummies solely through projections onto an adaptively evolving low-dimensional subspace. For rotationally invariant dummy distributions, we derive an adaptive stick-breaking construction sampling these projections from their exact conditional distribution given the selection history, thereby eliminating dummy matrix materialization. We prove a pathwise universality theorem: under mild delocalization conditions, selection paths driven by generic standardized i.i.d. dummies converge to the same Gaussian limit. We instantiate the theory through Virtual Dummy LARS (VD-LARS), reducing memory and runtime by several orders of magnitude while preserving the exact selection law and FDR guarantees of the T-Rex selector. Experiments on realistic genome-wide association study data confirm that VD-T-Rex controls FDR and achieves power at scales where all competing methods either fail or time out.
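To make the memory bottleneck concrete, here is a minimal sketch of the explicit dummy-augmentation baseline that the virtual-dummy construction eliminates: Gaussian null features are materialized and appended to the design matrix, and one random experiment terminates once T dummies enter. A greedy correlation screen stands in for LARS, and the data, dimensions, and function names are illustrative assumptions, not the T-Rex implementation.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy data: n samples, p real predictors, only the first two affect y.
n, p, L = 200, 50, 50                     # L = number of i.i.d. Gaussian dummies
X = rng.standard_normal((n, p))
y = X[:, 0] - X[:, 1] + 0.5 * rng.standard_normal(n)

# Explicit dummy augmentation: at biobank scale this matrix is the
# terabyte-sized object that VD-LARS never materializes.
D = rng.standard_normal((n, L))           # null features, independent of y
X_aug = np.hstack([X, D])

def one_experiment(X_aug, y, p, T=1):
    """One early-terminated experiment: greedy screen, stop at T dummies."""
    scores = np.abs(X_aug.T @ (y - y.mean()))
    order = np.argsort(scores)[::-1]      # forward-selection order (greedy proxy)
    picked, n_dummies = [], 0
    for j in order:
        if j >= p:                        # a dummy entered the path
            n_dummies += 1
            if n_dummies >= T:
                break
        else:
            picked.append(j)              # real feature selected before stop
    return picked

selected = one_experiment(X_aug, y, p)
```

The T-Rex selector aggregates many such experiments to calibrate the FDR; the paper's contribution is to sample only the low-dimensional projections of the dummies that this loop actually consumes, rather than the matrix D itself.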
Regularity of Solutions to Beckmann's Parametric Optimal Transport
Gottschalk, Hanno, Riedlinger, Tobias J.
Beckmann's problem in optimal transport minimizes the total squared flux in a continuous transport problem from a source to a target distribution. In this article, the regularity theory for solutions to Beckmann's problem is developed using an unconstrained Lagrangian formulation and solving the variational first-order optimality conditions. It turns out that the Lagrange multiplier enforcing Beckmann's divergence constraint satisfies a Poisson equation, and the flux vector field is obtained as the gradient of this potential. Using Schauder estimates from elliptic regularity theory, the exact Hölder regularity of the potential, the flux and the generated flow is derived on the basis of Hölder regularity of the source and target densities on a bounded, regular domain. If the target distribution depends on parameters, as is the case in conditional ("promptable") generative learning, we provide sufficient conditions for separate and joint Hölder continuity of the resulting vector field in the parameter and the data dimension. Following a recent result by Belomestny et al., one can thus approximate such vector fields with deep ReQU neural networks in the C^(k,alpha)-Hölder norm. We also show that this approach generalizes to other probability paths, such as Fisher-Rao gradient flows.
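The first-order optimality conditions described in the abstract can be sketched as follows. Signs and normalizing constants are convention-dependent assumptions here, and boundary terms are taken to vanish (e.g. under no-flux conditions on the regular domain):

```latex
% Beckmann's problem: minimize total squared flux subject to the
% divergence (mass-balance) constraint, for source \mu and target \nu:
\min_{v} \int_\Omega |v(x)|^2 \, dx
\quad \text{s.t.} \quad \nabla \cdot v = \mu - \nu .

% Unconstrained Lagrangian with multiplier \varphi:
\mathcal{L}(v, \varphi)
  = \int_\Omega |v|^2 \, dx
  + \int_\Omega \varphi \, \bigl( \nabla \cdot v - (\mu - \nu) \bigr) \, dx .

% Variation in v (after integration by parts, boundary terms vanishing):
2 v - \nabla \varphi = 0
\quad \Longrightarrow \quad
v = \tfrac{1}{2} \nabla \varphi .

% Substituting into the constraint gives the Poisson equation for the
% multiplier; the flux is the gradient of the potential:
\Delta \varphi = 2 \, (\mu - \nu) .
```

Elliptic (Schauder) regularity for this Poisson equation is then what transfers Hölder regularity of the densities to the potential, and hence to the flux.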
0b8aff0438617c055eb55f0ba5d226fa-Supplemental.pdf
In this supplemental material, we first present the detailed network architecture and parameters of the proposed approach in Sec. A. We further provide more analysis of the proposed method and ablation studies in Sec. B. Section C shows qualitative results for potential applications of the proposed approach in medical imaging and astronomical imaging.

[Figure 6: Illustration of learned deep features. (a) The blurry input and ground truth are shown in Figure 1(a)-(b).]

However, one may actually wonder whether the feature extraction network acts as a denoiser, leading to the observed robustness of the proposed method to various noise levels.