Uncertainty
FRESH: Information-Geometric Calibration of Patient-Level Models to Aggregate Evidence
Fuller, Franklin, Bertolini, Daniele, Liang, Samantha, Christopher, Jason, Smith, Aaron M.
Many decision in clinical science and epidemiology -- estimating probability of technical success for a clinical trial, assessing comparative effectiveness of two therapies, imputing a placebo effect onto natural history data -- rely on combining sources of information about a clinical cohort that comes from different kinds of studies. Specifically we contrast patient-level sources that provide granular pictures of individual disease course (clinical trial, registries, or electronic health records) with aggregate sources such as published clinical trial results and the TFLs (tables figures and listings). One strategy for combining aggregate with patient-level data sources is to bring each into a common format for a unified analysis. If one wants to maintain the analytic flexibility of patient-level data, then a natural solution is to convert the aggregate data information into a simulated patient-level dataset that recapitulate those aggregate statistics. This is an under-determined inverse problem in that there are many such datasets, and it cannot be well specified without further constraints. FRESH(Fusion of Recent Evidence with Subject Histories) provides a well-defined method for solving this problem, and therefore providing maximal analytic flexibility.
Functional-prior-based approaches to Bayesian PDE-constrained inversion using physics-informed neural networks
Agata, Ryoichiro, Okazaki, Tomohisa
Physics-informed neural networks (PINNs) provide a mesh-free framework for solving PDE-constrained inverse problems, but their extension to Bayesian inversion still faces a fundamental difficulty: prior distributions are typically defined in the weight space of neural networks, whereas physically meaningful prior assumptions are more naturally expressed in function space. In this study, we introduce a unified framework, termed functional-prior-based approaches to Bayesian PDE-constrained inversion using physics-informed neural networks (fpBPINN), to incorporate functional priors into Bayesian PINN-based inversion. We consider two complementary approaches. The first is a functional-prior-informed Bayesian PINN (FPI-BPINN), in which a neural network weight prior is learned to be consistent with a prescribed functional prior, and Bayesian inference is subsequently performed in weight space. The second is function-space particle-based variational inference for PINNs (fParVI-PINN), which performs Bayesian estimation using ParVI directly in function space. We also show that random Fourier features (RFF) play an important role in representing Gaussian functional priors with neural networks and in improving posterior approximation. We applied the proposed approaches to one-dimensional seismic traveltime tomography and two-dimensional Darcy-flow permeability inversion. These numerical experiments showed that both approaches accurately estimated posterior distributions, highlighting the significance of introducing physically interpretable functional priors into Bayesian PINN-based inverse problems. We also identified the contrasting advantages of FPI-BPINN and fParVI-PINN, namely flexibility and accuracy, respectively.
Sample-Mean Anchored Thompson Sampling for Offline-to-Online Learning with Distribution Shift
Li, Bochao, Fu, Yao, Chen, Wei, Kong, Fang
Offline-to-online learning aims to improve online decision-making by leveraging offline logged data. A central challenge in this setting is the distribution shift between offline and online environments. While some existing works attempt to leverage shifted offline data, they largely rely on UCB-type algorithms. Thompson sampling (TS) represents another canonical class of bandit algorithms, well known for its strong empirical performance and naturally suited to offline-to-online learning through its Bayesian formulation. However, unlike UCB indices, posterior samples in TS are not guaranteed to be optimistic with respect to the true arm means. This makes indices constructed from purely online and hybrid data difficult to compare and complicates their use. To address this issue, we propose sample-mean anchored TS (Anchor-TS), which introduces a novel median-based anchoring rule that defines the arm index as the median of an online posterior sample, a hybrid posterior sample, and the online sample mean. The median anchoring systematically corrects bias induced by distribution shift by mitigating over-estimation for suboptimal arms and under-estimation for optimal arms, while exploiting offline information to obtain more accurate estimates when the shift is small. We establish theoretical guarantees showing that the proposed algorithm safely leverages offline data to accelerate online learning, and quantifying how the degree of distribution shift and the size of offline data affect the resulting regret reduction. Extensive experiments demonstrate consistent improvements of our algorithm over baselines.
TabPFN-3: Technical Report
Grinsztajn, Lรฉo, Flรถge, Klemens, Key, Oscar, Birkel, Felix, Jund, Philipp, Roof, Brendan, Manium, Mihir, Bin, Shi, Hoo, null, Bรผhler, Magnus, Garg, Anurag, Safaric, Dominik, Robertson, Jake, Jรคger, Benjamin, Alessi, Simone, Hayler, Adrian, Moroshan, Vladyslav, Purucker, Lennart, Singer, Philipp, Arazi, Alan, Siems, Julien, Metzen, Jan Hendrik, Grab, Georg, Erickson, Nick, Guo, Siyuan, Kalfon, Eliott, Bing, Simon, Salinas, David, Cornu, Clara, Wehrhahn, Lilly Charlotte, Kriuchkova, Diana, Kaya, Kursat, Sidhoum, Lydia, Salmon, Marie, Chen, Jerry, Hulsebos, Madelon, LeCun, Yann, Mรผller, Samuel, Schรถlkopf, Bernhard, Gambhir, Sauraj, Hollmann, Noah, Hutter, Frank
Tabular data underpins most high-value prediction problems in science and industry, and TabPFN has driven the foundation model revolution for this modality. Designed with feedback from our users, TabPFN-3 builds on this foundation to scale state-of-the-art performance to datasets with 1M training rows and substantially reduce training and inference time. Pretrained exclusively on synthetic data from our prior, TabPFN-3 dramatically pushes the frontier of tabular prediction and brings substantial gains on time series, relational, and tabular-text data. On the standard tabular benchmark TabArena, a forward pass of TabPFN-3 outperforms all other models, including tuned and ensembled baselines, by a significant margin, and pareto-dominates the speed/performance frontier. On more diverse datasets, TabPFN-3 ranks first on datasets with many classes, and beats 8-hour-tuned gradient-boosted-tree baselines on datasets up to 1M training rows and 200 features. TabPFN-3 introduces test-time compute scaling to tabular foundation models. Our API offering TabPFN-3-Plus (Thinking) exploits this to beat all non-TabPFN models by over 200 Elo on TabArena, rising to 420 Elo on the largest data subset, and outperforms AutoGluon 1.5 extreme while being 10x faster, without using LLMs, real data, internet search or any other model besides TabPFN. TabPFN-3 extends the capabilities of our models, enabling SOTA prediction on relational data (new SOTA foundation model on RelBenchV1) and tabular-text data (SOTA on TabSTAR via TabPFN-3-Plus); and improves existing integrations: a specialized checkpoint, TabPFN-TS-3, ranks 2nd on the time-series benchmark fev-bench, and SHAP-value computation is up to 120x faster. TabPFN-3 achieves this performance while being up to 20x faster than TabPFN-2.5. In addition, a reduced KV cache and row-chunking scale to 1M rows on one H100 with fast inference speed.
To discretize continually: Mean shift interacting particle systems for Bayesian inference
Belhadji, Ayoub, Sharp, Daniel, Marzouk, Youssef M.
Integration against a probability distribution given its unnormalized density is a central task in Bayesian inference and other fields. We introduce new methods for approximating such expectations with a small set of weighted samples -- i.e., a quadrature rule -- constructed via an interacting particle system that minimizes maximum mean discrepancy (MMD) to the target distribution. These methods extend the classical mean shift algorithm, as well as recent algorithms for optimal quantization of empirical distributions, to the case of continuous distributions. Crucially, our approach creates dynamics for MMD minimization that are invariant to the unknown normalizing constant; they also admit both gradient-free and gradient-informed implementations. The resulting mean shift interacting particle systems converge quickly, capture anisotropy and multi-modality, avoid mode collapse, and scale to high dimensions. We demonstrate their performance on a wide range of benchmark sampling problems, including multi-modal mixtures, Bayesian hierarchical models, PDE-constrained inverse problems, and beyond.
Finite Sample Bounds for Learning with Score Matching
Smedira, Devin, Jayakumar, Abhijith, Misra, Sidhant, Vuffray, Marc, Lokhov, Andrey Y.
Learning of continuous exponential family distributions with unbounded support remains an important area of research for both theory and applications in high-dimensional statistics. In recent years, score matching has become a widely used method for learning exponential families with continuous variables due to its computational ease when compared against maximum likelihood estimation. However, theoretical understanding of the statistical properties of score matching is still lacking. In this work, we provide a non-asymptotic sample complexity analysis for learning the structure of exponential families of polynomials with score matching. The derived sample bounds show a polynomial dependence on the model dimension. These bounds are the first of its kind, as all prior work has shown only asymptotic bounds on the sample complexity.
Language-Induced Priors for Domain Adaptation
Chen, Qiyuan, Zhou, Jiayu, Kontar, Raed Al
Domain adaptation faces a fundamental paradox in the cold-start regime. When target data is scarce, statistical methods fail to distinguish relevant source domains from irrelevant ones, which often leads to negative transfer. In this paper, we address this challenge by leveraging expert textual descriptions of the target domain, a resource that is often available but overlooked. We propose a probabilistic framework that translates these semantic descriptions into a choice model, namely a Language-Induced Prior (LIP), that learns the preferences from a pretrained Large Language Model (LLM). The LIP is then integrated into an Expectation-Maximization algorithm to identify source relevance. Methodologically, this framework is compatible with any parametric model where a likelihood is available. It allows the LIP to guide the selection of sources when target signals are weak, while gradually refining these choices as samples accumulate. Theoretically, we prove that the estimator roughly matches an oracle cold-start MSE under a correct prior, while remaining asymptotically consistent regardless of the quality of the LIP. Empirically, we validated the framework on a descriptive (Gaussian estimation), a predictive (C-MAPSS dataset), and a prescriptive task (MuJoCo hopper).
Fast Rates for Inverse Reinforcement Learning
Schlaginhaufen, Andreas, Kamgarpour, Maryam
We establish novel structural and statistical results for entropy-regularized min-max inverse reinforcement learning (Min-Max-IRL) with linear reward classes in finite-horizon MDPs with Borel state and action spaces. On the structural side, we show that maximum likelihood estimation (MLE) and Min-Max-IRL are equivalent at the population level, and at the empirical level under deterministic dynamics. On the statistical side, exploiting pseudo-self-concordance of the Min-Max-IRL loss, we prove that both the trajectory-level KL divergence and the squared parameter error in the Hessian norm decay at the fast rate $\mathcal{O}(n^{-1})$, where $n$ is the number of expert trajectories. Our guarantees apply under misspecification and require no exploration assumptions. We further extend reward-identifiability results to general Borel spaces and derive novel results on the derivatives of the soft-optimal value function with respect to reward parameters.
In-Context Learning for Data-Driven Censored Inventory Control
Mukherjee, Sohom, Pham, Anh-Duy, Pibernik, Richard, Xu, Yunbei
We study inventory control with decision-dependent censoring, focusing on the censored or repeated newsvendor (R-NV), where each order quantity determines whether demand is fully observed or censored by sales. Existing approaches based on parametric Thompson sampling (TS) can be brittle under prior mismatch, while offline imputation methods need not transfer to online learning. Motivated by the predictive view of decision making, we combine these ideas by taking oracle actions on learned completions of latent demand. We propose in-context generative posterior sampling (ICGPS), which uses modern generative models that are meta-trained offline and deployed online by in-context autoregressive generation. Theoretically, we show that the Bayesian regret of ICGPS with a learned completion kernel is bounded by the Bayesian regret of a TS benchmark with the ideal completion kernel plus a deployment penalty scaling as $\sqrt{T}$ times the square root of the completion mismatch. This yields a plug-in template for operational problems with known TS regret bounds. For R-NV, we derive sublinear Bayesian regret by reducing censored feedback to bandit convex optimization feedback. We also show that, under reasonable coverage and stability assumptions, the online completion mismatch is controlled by the offline censored predictive mismatch, so offline predictive quality transfers to online performance. Practically, we instantiate ICGPS with ChronosFlow, which combines a frozen time-series transformer backbone with a trainable conditional normalizing-flow head for fast censoring-consistent sampling. In benchmark experiments, ChronosFlow-ICGPS matches correctly specified TS, outperforms myopic and UCB-style baselines, and is robust to prior mismatch and distribution shift. ChronosFlow-ICGPS also performs well for the real-world SuperStore dataset, especially under heavy censoring.
Variational predictive resampling
Battaglia, Laura, Cortinovis, Stefano, Holmes, Chris, Frazier, David T., Jewson, Jack
Bayesian inference provides principled uncertainty quantification, but accurate posterior sampling with MCMC can be computationally prohibitive for modern applications. Variational inference (VI) offers a scalable alternative and often yields accurate predictive distributions, but cheap variational families such as mean-field (MF) can produce over-concentrated approximations that miss posterior dependence. We propose variational predictive resampling (VPR), a scalable posterior sampling method that exploits VI's predictive strength within a predictive-resampling framework to better approximate the Bayesian posterior. Given a prior-likelihood pair, VPR repeatedly imputes future observations from the current variational predictive, updates the variational approximation after each imputation, and records the parameter value implied by the completed sample. We establish conditions under which the law of the parameter returned by VPR is well defined and show that its finite-horizon approximation converges to this limit. In a tractable Gaussian location model, we show that VPR with MF variational predictives converges to the exact Bayesian posterior, whereas the optimal MF-VI approximation retains a non-vanishing asymptotic gap. Experiments on linear regression, logistic regression, and hierarchical linear mixed-effects models demonstrate that VPR substantially improves posterior uncertainty quantification and recovers posterior dependence missed by MF-VI, while remaining computationally competitive with, and often more efficient than, MCMC.