Proper Calibeating
The classic concept of "calibrated forecasts" and its more recent refinement, "calibeating," are defined with respect to the standard quadratic scoring rule. We extend these notions to the class of $\textit{proper}$ scoring rules (for which the best forecast is the true distribution) and define $\textit{proper-calibration}$ and $\textit{proper-calibeating}$ by requiring the errors to converge to zero uniformly over all bounded proper scoring rules. We first establish that calibration always implies proper-calibration, whereas calibeating need not imply proper-calibeating. Second, we show how to guarantee proper-calibeating and proper-multicalibeating. Finally, we demonstrate the equivalence between proper-calibration and universal no regret when best replying to forecasts in decision-making under uncertainty.
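As an illustration of the quadratic-score baseline that the abstract starts from, the sketch below computes the Brier (quadratic) score of a sequence of binary forecasts and splits it into a calibration term and a refinement term. The function name `brier_decomposition` and the toy data are assumptions made for this example; they do not come from the paper, and the sketch covers only the standard quadratic case, not the proper-scoring-rule extension.

```python
from collections import defaultdict


def brier_decomposition(forecasts, outcomes):
    """Return (brier, calibration, refinement) for binary outcomes in {0, 1}.

    Forecasts are grouped by their exact value (hypothetical helper, for
    illustration only).
    """
    n = len(forecasts)
    by_forecast = defaultdict(list)
    for c, a in zip(forecasts, outcomes):
        by_forecast[c].append(a)

    # Average quadratic loss of the forecasts.
    brier = sum((c - a) ** 2 for c, a in zip(forecasts, outcomes)) / n

    calibration = 0.0  # gap between each forecast and the realized frequency
    refinement = 0.0   # outcome variance inside each forecast group
    for c, acts in by_forecast.items():
        weight = len(acts) / n
        mean = sum(acts) / len(acts)
        calibration += weight * (c - mean) ** 2
        refinement += weight * sum((a - mean) ** 2 for a in acts) / len(acts)
    return brier, calibration, refinement


forecasts = [0.8, 0.8, 0.8, 0.8, 0.2, 0.2]
outcomes = [1, 1, 1, 0, 0, 1]
b, k, r = brier_decomposition(forecasts, outcomes)
print(b, k, r)
```

The calibration term vanishes exactly when every forecast value matches the empirical frequency of the outcome on the periods where it was issued, and the Brier score equals the sum of the two terms.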
EviTrack: Selection over Sampling for Delayed Disambiguation
Sequential prediction is challenging in regimes of delayed disambiguation, where early observations are ambiguous and multiple latent explanations remain plausible until sufficient evidence accumulates. Standard approaches based on marginal inference struggle in this setting, either collapsing uncertainty prematurely or failing to recover once informative evidence arrives. We introduce EviTrack, a test-time inference framework that operates over latent trajectories rather than marginal states. EviTrack maintains a set of competing trajectory hypotheses and applies evidence- and likelihood-ratio-based selection to delay commitment until supported by data, drawing inspiration from hypothesis management in multiple hypothesis tracking and track-before-detect. To evaluate this setting, we construct a controlled synthetic benchmark with known latent ground truth that explicitly exhibits delayed disambiguation. At matched inference budget, EviTrack substantially outperforms sampling-based baselines, achieving faster post-disambiguation recovery. These results show that, in delayed disambiguation regimes, moderate trajectory-level selection is more effective than increasing sampling coverage, highlighting selection over sampling as a key principle for reliable sequential inference.
Harnessing Unimodality in Semiparametric Contextual Pricing via Oracle Price Map Learning
Fan, Yingying, Han, Yuxuan, Lv, Jinchi, Xu, Xiaocong, Zhou, Zhengyuan
We study contextual dynamic pricing in a semiparametric scalar-index valuation model where the latent value is $v_t=\mu_\ast(\mathsf c_t)+\xi_t$, with an unknown utility map $\mu_\ast$ and an unknown additive noise distribution. The key decision object is the one-dimensional oracle price map $u\mapsto p^\ast(u)$ induced by the scalar index $u=\mu_\ast(\mathsf c)$ and the noise tail. Under $\beta$-Hölder smoothness of the tail function for $\beta\geq 2$ and a revenue-geometry condition that gives a unique, stable, interior maximizer, this oracle map is itself $(\beta-1)$-smooth. We exploit this structure through $\mathsf{ORBIT}$, a modular coarse-to-fine policy that takes a scalar pilot index as input, localizes a benchmark price in each active bin, and learns a local polynomial approximation of the oracle map inside a trust region via bandit convex optimization. For the baseline linear utility model $\mu_\ast(\mathsf c)=\mathsf c^\top\theta_\ast$, an adaptive elliptical exploration scheme constructs the required scalar pilot online without distributional assumptions on the contexts. The resulting policy achieves regret $\widetilde{O}\big(T^{\frac{2\beta-1}{4\beta-3}}+\sqrt{dT}\big)$. For fixed $d$, we establish a matching lower bound in the horizon dependence, showing that the nonparametric oracle-map learning term is minimax sharp. The same scalar-pilot interface also yields extensions to sparse high-dimensional linear utility and nonparametric Hölder utility.
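To make the oracle price map concrete, the sketch below evaluates it numerically in one special case. It assumes a buyer purchases when the latent value is at least the posted price, so the expected revenue at price $p$ and index $u$ is $p\,S(p-u)$ with $S$ the noise tail, and it assumes standard logistic noise. Both the logistic tail and the names `tail` and `oracle_price` are choices made for this example; the paper treats the noise distribution as unknown and learns the map online rather than by grid search.

```python
import math


def tail(z):
    """Assumed noise tail S(z) = P(xi >= z) for standard logistic noise."""
    return 1.0 / (1.0 + math.exp(z))


def oracle_price(u, p_max=10.0, step=0.001):
    """Grid-search the price maximizing expected revenue p * S(p - u).

    Hypothetical helper: a known tail is assumed here purely to visualize
    the map u -> p*(u).
    """
    n_points = round(p_max / step)
    grid = [i * step for i in range(n_points + 1)]
    return max(grid, key=lambda p: p * tail(p - u))


for u in (0.0, 0.5, 1.0):
    print(u, round(oracle_price(u), 3))
```

In this logistic case the maximizer is interior and moves smoothly with $u$, which is the kind of structure the revenue-geometry condition in the abstract is meant to guarantee.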
SurvivalPFN: Amortizing Survival Prediction via In-Context Bayesian Inference
Qi, Shi-ang, Balazadeh, Vahid, Cooper, Michael, Greiner, Russell, Krishnan, Rahul G.
Survival analysis provides a powerful statistical framework for modeling time-to-event outcomes in the presence of censoring. However, selecting an appropriate estimator from the many specialized survival approaches often requires substantial methodological and domain expertise. We introduce SurvivalPFN, a prior-data fitted network that amortizes Bayesian inference for censored observations through in-context learning. SurvivalPFN is pretrained on a diverse family of synthetic, identifiable, and right-censored data-generating processes, enabling it to amortize survival analysis in a single forward pass during inference. As a result, the model adapts to the effective complexity of each dataset without task-specific training or hyperparameter tuning, avoids restrictive parametric assumptions, and produces calibrated survival distributions. In a large-scale benchmark spanning 61 datasets, 21 methods, and 5 evaluation metrics, SurvivalPFN achieves strong predictive performance and often improves upon established survival models. These results suggest that SurvivalPFN offers a principled and practical foundation model for survival analysis, with potential applications in high-impact domains such as healthcare, finance, and engineering (https://github.com/rgklab/SurvivalPFN).
Optimal Regret for Single Index Bandits
Dey, Devdan, Bhore, Sujoy, Ghosh, Avishek
We study the $\textit{single-index bandit}$ problem, where rewards depend on an unknown one-dimensional projection of high-dimensional contexts through an unknown reward function. This model extends linear and generalized linear bandits to a nonparametric setting, and is particularly relevant when the reward function is not known in advance. While optimal regret guarantees are known for monotone reward functions, the general non-monotone case remains poorly understood, with the best known bound being $\tilde{\mathcal{O}}(T^{3/4})$ (under standard boundedness and Lipschitz assumptions on the reward function [Kang et al., 2025]). We close this gap by establishing the optimal regret for general single-index bandits. We propose a simple two-phase algorithm, Zoomed Single Index Bandit with Upper Confidence Bound ($\texttt{ZoomSIB-UCB}$), that first estimates the projection direction via a normalized Stein estimator, then reduces the problem to a one-dimensional bandit via discretization, and finally applies UCB. This approach achieves a regret of $\tilde{\mathcal{O}}(T^{2/3})$, improving significantly upon prior work without any additional assumptions. We also prove a matching minimax lower bound of $\tilde{\Omega}(T^{2/3})$, showing that the upper bound is essentially tight. Together, our upper and lower bounds provide a sharp characterization of the regret in single-index bandits. Empirical results further demonstrate the effectiveness and robustness of our approach.
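The direction-estimation phase can be illustrated with a first-order Stein identity: for standard Gaussian contexts $x$ and reward $y=f(\theta^\top x)+\text{noise}$, the average of $y\,x$ points along $\theta$ whenever $\mathbb{E}[f'(\theta^\top x)]\neq 0$. The sketch below is a minimal version under those assumptions. The Gaussian contexts, the particular non-monotone reward function, and the names `stein_direction` and `simulate` are assumptions made for this example and are not taken from the paper, whose normalized Stein estimator may differ in its details.

```python
import math
import random


def stein_direction(contexts, rewards):
    """Normalize the empirical average of y * x to get a unit direction."""
    n, d = len(contexts), len(contexts[0])
    avg = [0.0] * d
    for x, y in zip(contexts, rewards):
        for j in range(d):
            avg[j] += y * x[j] / n
    norm = math.sqrt(sum(v * v for v in avg))
    return [v / norm for v in avg]


def simulate(n=20000, d=5, seed=0):
    """Draw Gaussian contexts and rewards from an assumed non-monotone link."""
    rng = random.Random(seed)
    theta = [1.0 / math.sqrt(d)] * d  # unit-norm ground-truth direction
    contexts, rewards = [], []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in range(d)]
        z = sum(t * xi for t, xi in zip(theta, x))
        # Hypothetical reward: non-monotone, with nonzero average slope.
        y = math.sin(2.0 * z) + 0.5 * z + rng.gauss(0.0, 0.1)
        contexts.append(x)
        rewards.append(y)
    return theta, contexts, rewards


theta, contexts, rewards = simulate()
theta_hat = stein_direction(contexts, rewards)
cosine = sum(a * b for a, b in zip(theta, theta_hat))
print(round(cosine, 3))
```

Once a direction estimate is available, each context collapses to the scalar $\hat\theta^\top x$, which is what allows the second phase to discretize a one-dimensional interval and run UCB over the resulting cells.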
Dynamic Treatment on Networks
Nar, Bengusu, Li, Jiguang, Ročková, Veronika, Toulis, Panos
In networks, effective dynamic treatment allocation requires deciding both whom to treat and when, so as to amplify policy impact through spillovers. An early intervention at a well-connected node can trigger cascades that change which nodes are worth targeting in the next period. Existing treatment strategies under network interference are largely static, while dynamic treatment frameworks typically ignore network structure altogether. We integrate these perspectives and propose Q-Ising, a three-stage pipeline that (i) estimates network adoption dynamics via a Bayesian dynamic Ising model from a single observed panel, (ii) augments treatment adoption histories with continuous posterior latent states, and (iii) learns a dynamic policy via offline reinforcement learning. The Bayesian mechanism enables uncertainty quantification over dynamic decisions, yielding posterior ensemble policies with interpretable spillover estimates. We provide a finite-sample regret upper bound that decomposes into standard offline-RL uncertainty, network abstraction error, and first-stage error in Ising state estimation. We apply our method to data from Indian village microfinance networks and to synthetic stochastic block models under simulated heterogeneous susceptible-infected-susceptible (SIS) dynamics, and demonstrate that adaptive targeting outperforms static centrality benchmarks.
8 Supplementary Material
Calculation of $T$
Given data $D$, disaggregate $Y$ into $M$ equal-size bins, with the $m$-th bin denoted $B_m$, and let $|B_m|$ be the number of samples in $B_m$. For a distribution $p \in \Delta(V \times A \times Y)$ conditioned on $y \in B_m$, let $p_{V,A \mid y_m}$, $p_{V \mid y_m}$, and $p_{A \mid y_m}$ denote the joint distribution of $(V, A)$ and the marginal distributions of $V$ and $A$, respectively. As detailed in Section 5.1 of [33] and Algorithm 4 of [32], $U_m$ can be calculated through a U-statistic. Specifically, [33] designs the kernel for the $i$-th and $j$-th samples in $D_t$ and values $(a, v)$ as $\mathbb{I}(A_i = a, V_i = v) - \mathbb{I}(A_i = a)\,\mathbb{I}(V_j = v)$.
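A rough sketch of how a pairwise kernel of this kind turns into a U-statistic within one bin is given below. It assumes the kernel is the difference $\mathbb{I}(A_i=a, V_i=v)-\mathbb{I}(A_i=a)\,\mathbb{I}(V_j=v)$, averaged over ordered pairs $i\neq j$, so that its expectation is $p(a,v)-p(a)\,p(v)$. That reading of the kernel, the function name `dependence_u_statistic`, and the toy data are assumptions made for this example; the statistic actually used in [32] and [33] may combine such terms differently.

```python
def dependence_u_statistic(A, V, a, v):
    """Average the assumed kernel over all ordered pairs i != j in one bin.

    Estimates p(a, v) - p(a) * p(v); values near zero are consistent with
    A and V being independent at (a, v) within the bin.
    """
    n = len(A)
    total = 0.0
    for i in range(n):
        joint = 1.0 if (A[i] == a and V[i] == v) else 0.0
        a_hit = 1.0 if A[i] == a else 0.0
        for j in range(n):
            if i == j:
                continue
            v_hit = 1.0 if V[j] == v else 0.0
            total += joint - a_hit * v_hit
    return total / (n * (n - 1))


# Perfectly dependent toy bin: A determines V.
print(dependence_u_statistic([0, 0, 1, 1], [0, 0, 1, 1], a=0, v=0))
# Balanced toy bin in which A and V are empirically independent.
print(dependence_u_statistic([0, 0, 1, 1], [0, 1, 0, 1], a=0, v=0))
```

Using distinct samples $i$ and $j$ for the product term is what makes the estimate of $p(a)\,p(v)$ unbiased; the same-sample product would collapse back to the joint indicator.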
Equal Opportunity of Coverage in Fair Regression
We study fair machine learning (ML) under predictive uncertainty to enable reliable and trustworthy decision-making. The seminal work of "equalized coverage" proposed an uncertainty-aware fairness notion. However, it does not guarantee equal coverage rates across more fine-grained groups (e.g., low-income females) conditional on the true label, and it is biased in the assessment of uncertainty. To tackle these limitations, we propose a new uncertainty-aware fairness notion, Equal Opportunity of Coverage (EOC), that aims to achieve two properties: (1) coverage rates for different groups with similar outcomes are close, and (2) the coverage rate for the entire population remains at a predetermined level. Further, the prediction intervals should be narrow to be informative. We propose Binned Fair Quantile Regression (BFQR), a distribution-free post-processing method to improve EOC with reasonable width for any trained ML model. It first calibrates a hold-out set to bound deviation from EOC, then leverages conformal prediction to maintain EOC on a test set while optimizing prediction interval width. Experimental results demonstrate the effectiveness of our method in improving EOC.
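The first EOC property can be checked empirically by comparing coverage rates across groups within bins of the true outcome. The sketch below is one simple way to do that, assuming equal-count bins formed by sorting the outcomes. The function name `binned_group_coverage`, the binning rule, and the toy intervals are assumptions made for this example; this is an evaluation helper, not the BFQR procedure.

```python
from collections import defaultdict


def binned_group_coverage(y, lower, upper, group, n_bins):
    """Coverage rate of [lower, upper] for each (outcome bin, group) pair.

    Outcomes are ranked and split into n_bins equal-count bins
    (hypothetical helper, for illustration only).
    """
    n = len(y)
    order = sorted(range(n), key=lambda i: y[i])
    hits = defaultdict(int)
    counts = defaultdict(int)
    for rank, i in enumerate(order):
        m = rank * n_bins // n  # bin index in 0 .. n_bins - 1
        key = (m, group[i])
        counts[key] += 1
        hits[key] += int(lower[i] <= y[i] <= upper[i])
    return {key: hits[key] / counts[key] for key in counts}


y = [1, 2, 3, 4, 5, 6, 7, 8]
group = ["a", "b", "a", "b", "a", "b", "a", "b"]
lower = [0, 1, 2, 10, 4, 5, 6, 7]
upper = [2, 3, 4, 11, 6, 7, 8, 7.5]
coverage = binned_group_coverage(y, lower, upper, group, n_bins=2)
for key in sorted(coverage):
    print(key, coverage[key])
```

A large gap between groups inside the same outcome bin signals a violation of property (1), even when the population-level coverage in property (2) sits at its target.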
Assumptions and Likelihoods in More Detail
A.1 Notation
Let $T$ be a failure time with CDF $F$. $T$'s survival function is defined by $\bar{F} = 1 - F$. We denote failure models by $F_{\theta_T}$. Let $C$ be a censoring time with CDF $G$, survival function $\bar{G}$, and model $G_{\theta_C}$. Under right-censoring, define $U = \min(T, C)$ and $\Delta = \mathbb{1}[T \leq C]$; we observe $(X_i, U_i, \Delta_i)$. We use $G(t)$ to denote $P(C \leq t)$.
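The right-censoring setup can be simulated in a few lines, which may help in reading the notation. The sketch below draws a failure time and a censoring time, then records the observed time $U=\min(T,C)$ and the event indicator $\mathbb{1}[T\le C]$. The exponential distributions, the rate values, and the function name `simulate_right_censored` are assumptions made for this example and are not part of the source.

```python
import random


def simulate_right_censored(n, rate_t=1.0, rate_c=0.5, seed=0):
    """Draw (t, c, u, delta) tuples under assumed exponential T and C."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        t = rng.expovariate(rate_t)   # latent failure time T
        c = rng.expovariate(rate_c)   # latent censoring time C
        u = min(t, c)                 # observed time U = min(T, C)
        delta = 1 if t <= c else 0    # 1 if the failure was observed
        data.append((t, c, u, delta))
    return data


data = simulate_right_censored(5000)
event_fraction = sum(row[3] for row in data) / len(data)
print(round(event_fraction, 3))
```

Only $U$ and the indicator would be available to an analyst in practice; $T$ and $C$ are kept in the tuples here so the relationship between them can be inspected.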