test point
- North America > United States > California > Los Angeles County > Los Angeles (0.28)
- Asia > Middle East > Israel > Haifa District > Haifa (0.04)
- Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.72)
We thank all the reviewers for excellent questions and many relevant remarks
We thank all the reviewers for excellent questions and many relevant remarks. Thank you for this remark. One of the reasons for this is that our method produces interpretations directly in terms of the input features. Thank you for pointing this out; we agree that "faithful" is not the best term. This is not the case for local models such as LIME.
Test-Time Collective Prediction
An increasingly common setting in machine learning involves multiple parties, each with their own data, who want to jointly make predictions on future test points. Each party wishes to benefit from the collective expertise of the full group to make better predictions than it could individually, but may not be willing to release labeled data or model parameters.
- North America > United States > California > Alameda County > Berkeley (0.04)
- Asia > Middle East > Jordan (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (4 more...)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Italy (0.04)
- (2 more...)
Online Partitioned Local Depth for semi-supervised applications
Foley, John D., Lee, Justin T.
We introduce an extension of the partitioned local depth (PaLD) algorithm that is adapted to online applications such as semi-supervised prediction. The new algorithm we present, online PaLD, is well suited to situations where it is possible to pre-compute a cohesion network from a reference dataset. After $O(n^3)$ steps to construct a queryable data structure, online PaLD can extend the cohesion network to a new data point in $O(n^2)$ time. Our approach complements previous speed-up approaches based on approximation and parallelism. As illustrations, we present applications to online anomaly detection and semi-supervised classification for health-care datasets.
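The cohesion network that online PaLD pre-computes comes from the batch PaLD construction: for each pair of points, a "conflict focus" of nearby points is formed, and each member of that focus contributes cohesion to whichever of the pair it sits closer to. The sketch below is an illustrative reading of that $O(n^3)$ construction, not the authors' code; the function name and tie handling are assumptions.

```python
import numpy as np

def cohesion_matrix(D):
    """Pairwise cohesion from a distance matrix D (illustrative sketch of the
    batch PaLD construction that online PaLD extends; O(n^3) overall)."""
    n = D.shape[0]
    C = np.zeros((n, n))
    for x in range(n):
        for y in range(n):
            if x == y:
                continue
            # Conflict focus of (x, y): points at least as close to x or to y
            # as x and y are to each other.
            U = [z for z in range(n)
                 if D[z, x] <= D[x, y] or D[z, y] <= D[x, y]]
            for z in U:
                if D[z, x] < D[z, y]:                # z sides with x
                    C[x, z] += 1.0 / (len(U) * (n - 1))
    return C
```

Extending such a network to one new point only requires its conflict foci against the existing points, which is the $O(n^2)$ query step the abstract refers to.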
Weighted Conformal Prediction Provides Adaptive and Valid Mask-Conditional Coverage for General Missing Data Mechanisms
Fan, Jiarong, Park, Juhyun, Vo, Thi Phuong Thuy, Brunel, Nicolas
Conformal prediction (CP) offers a principled framework for uncertainty quantification, but it fails to guarantee coverage when faced with missing covariates. In addressing the heterogeneity induced by various missing patterns, Mask-Conditional Valid (MCV) Coverage has emerged as a more desirable property than Marginal Coverage. In this work, we adapt split CP to handle missing values by proposing a preimpute-mask-then-correct framework that can offer valid coverage. We show that our method provides guaranteed Marginal Coverage and Mask-Conditional Validity for general missing data mechanisms. A key component of our approach is a reweighted conformal prediction procedure that corrects the prediction sets after distributional imputation (multiple imputation) of the calibration dataset, making our method compatible with standard imputation pipelines. We derive two algorithms and show that they are approximately marginally valid and MCV. We evaluate them on synthetic and real-world datasets: our method significantly reduces the width of prediction intervals relative to standard MCV methods, while maintaining the target guarantees.
- North America > United States (0.14)
- Europe > France > Île-de-France > Paris > Paris (0.14)
- Europe > Finland > Uusimaa > Helsinki (0.04)
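The procedure the paper corrects builds on standard split conformal prediction: compute residuals on a held-out calibration set, take a finite-sample-corrected quantile, and widen the point prediction by that amount. A minimal sketch of this unweighted building block (the paper's contribution is to reweight the calibration residuals according to the test point's missingness mask; the function name here is illustrative):

```python
import numpy as np

def split_conformal_interval(resid_cal, y_pred, alpha=0.1):
    """Standard split conformal interval from calibration residuals.
    This is the unweighted baseline; mask-dependent reweighting of
    resid_cal is what the paper adds on top."""
    n = len(resid_cal)
    # Finite-sample correction: ceil((n+1)(1-alpha))/n instead of 1-alpha.
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    q = np.quantile(np.abs(resid_cal), level)
    return y_pred - q, y_pred + q
```

With exchangeable calibration and test residuals, the interval covers with probability at least 1 - alpha; missing covariates break that exchangeability, which is what motivates the reweighting.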
Provable Recovery of Locally Important Signed Features and Interactions from Random Forest
Vuk, Kata, Ihlo, Nicolas Alexander, Behr, Merle
Feature and Interaction Importance (FII) methods are essential in supervised learning for assessing the relevance of input variables and their interactions in complex prediction models. In many domains, such as personalized medicine, local interpretations for individual predictions are often required, rather than global scores summarizing overall feature importance. Random Forests (RFs) are widely used in these settings, and existing interpretability methods typically exploit tree structures and split statistics to provide model-specific insights. However, theoretical understanding of local FII methods for RF remains limited, making it unclear how to interpret high importance scores for individual predictions. We propose a novel, local, model-specific FII method that identifies frequent co-occurrences of features along decision paths, combining global patterns with those observed on paths specific to a given test point. We prove that our method consistently recovers the true local signal features and their interactions under a Locally Spiky Sparse (LSS) model and also identifies whether large or small feature values drive a prediction. We illustrate the usefulness of our method and theoretical results through simulation studies and a real-world data example.
- Europe > Germany > Bavaria > Regensburg (0.04)
- North America > United States > New York (0.04)
- North America > United States > Florida > Broward County (0.04)
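The raw ingredient of such path-based local FII scores can be illustrated directly: follow the test point down each tree and count which feature pairs co-occur on its decision path. This is a simplified proxy for the abstract's idea, not the paper's estimator; the function name and the counting scheme are assumptions.

```python
import numpy as np
from itertools import combinations
from collections import Counter
from sklearn.ensemble import RandomForestClassifier

def path_cooccurrences(forest, x):
    """Count feature pairs co-occurring on the decision path of test point x,
    across all trees (illustrative proxy for path-based local FII)."""
    counts = Counter()
    for tree in forest.estimators_:
        t = tree.tree_
        node, feats = 0, set()
        while t.children_left[node] != -1:          # descend until a leaf
            f = int(t.feature[node])
            feats.add(f)
            if x[f] <= t.threshold[node]:
                node = t.children_left[node]
            else:
                node = t.children_right[node]
        for pair in combinations(sorted(feats), 2):
            counts[pair] += 1
    return counts
```

On data with a genuine two-feature interaction (e.g. an XOR of features 0 and 1 plus noise features), the interacting pair dominates these counts, which is the intuition behind interpreting high local co-occurrence scores.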
Specialization after Generalization: Towards Understanding Test-Time Training in Foundation Models
Hübotter, Jonas, Wolf, Patrik, Shevchenko, Alexander, Jüni, Dennis, Krause, Andreas, Kur, Gil
Many standard test-time training (TTT) methods train on carefully selected data from the pre-training dataset (i.e., do not add any new privileged information; Hardt & Sun, 2024; Hübotter et al., 2025), and several works studied how to optimally select data for imitation, e.g., the early seminal work of MacKay (1992) and recent extensions (Hübotter et al., 2024; Bagatella et al., 2025b). TTT has also been extended from supervised learning to reinforcement learning (Zuo et al., 2025; Bagatella et al., 2025a; Diaz-Bone et al., 2025). So far, it has not been well understood why and when TTT is effective. While many different methods have been proposed for TTT, we focus here on analyzing "semi-parametric" TTT (e.g., Hardt & Sun, 2024; Hübotter et al., 2025), where a pre-trained model is fine-tuned with a supervised loss on a small neighborhood of the test point in the training data. This differs from other methods for test-time "adaptation", which are commonly applied under distribution shifts (e.g., Wang et al., 2021; Zhang et al., 2022; Durasov et al., 2025). Basu et al. (2023) consider a similar setting to ours, but analyze it through the lens of non-parametric estimation, relying on the smoothness of the target function in the feature space Ψ.
- Europe > Switzerland > Zürich > Zürich (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)
- (2 more...)
Supervised learning pays attention
Craig, Erin, Tibshirani, Robert
In-context learning with attention enables large neural networks to make context-specific predictions by selectively focusing on relevant examples. Here, we adapt this idea to supervised learning procedures for tabular data, such as lasso regression and gradient boosting. Our goals are to (1) flexibly fit personalized models for each prediction point and (2) retain model simplicity and interpretability. Our method fits a local model for each test observation by weighting the training data according to attention, a supervised similarity measure that emphasizes features and interactions that are predictive of the outcome. Attention weighting allows the method to adapt to heterogeneous data in a data-driven way, without requiring pre-specified clusters or similarity measures. Further, our approach is uniquely interpretable: for each test observation, we identify which features are most predictive and which training observations are most relevant. We then show how to use attention weighting for time series and spatial data, and we present a method for adapting pretrained tree-based models to distributional shift using attention-weighted residual corrections. Across real and simulated datasets, attention weighting improves predictive performance while preserving interpretability, and our theory shows that attention-weighted linear models attain lower mean squared error than the standard linear model under mixture-of-models data-generating processes with known subgroup structure.
- North America > United States > Michigan (0.04)
- North America > United States > Florida > Miami-Dade County > Miami (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.66)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)
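The core mechanic of the abstract above is a per-test-point weighted fit. A minimal sketch: the paper's attention is a *supervised* similarity learned from the outcome, whereas this illustration substitutes a plain softmax over squared Euclidean distances; the function name and the bandwidth `tau` are assumptions.

```python
import numpy as np

def attention_weighted_fit(X, y, x_test, tau=0.1):
    """Fit a local linear model for one test point by attention-weighting the
    training rows (sketch: Euclidean distance stands in for the paper's
    supervised similarity)."""
    d2 = ((X - x_test) ** 2).sum(axis=1)
    a = np.exp(-d2 / tau)
    a /= a.sum()                                   # attention weights
    sw = np.sqrt(a)
    Xd = np.hstack([X, np.ones((len(X), 1))])      # add intercept column
    beta, *_ = np.linalg.lstsq(Xd * sw[:, None], y * sw, rcond=None)
    return beta                                    # [slopes..., intercept]
```

Under mixture-of-models data (different linear regimes in different regions), the attention-weighted fit recovers the regime local to the test point, while a single global linear model averages across regimes.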
SetAD: Semi-Supervised Anomaly Learning in Contextual Sets
Gao, Jianling, Tao, Chongyang, Lin, Xuelian, Liu, Junfeng, Ma, Shuai
Semi-supervised anomaly detection (AD) has shown great promise by effectively leveraging limited labeled data. However, existing methods are typically structured around scoring individual points or simple pairs. Such a point- or pair-centric view not only overlooks the contextual nature of anomalies, which are defined by their deviation from a collective group, but also fails to exploit the rich supervisory signals that can be generated from the combinatorial composition of sets. Consequently, such models struggle to exploit the high-order interactions within the data, which are critical for learning discriminative representations. To address these limitations, we propose SetAD, a novel framework that reframes semi-supervised AD as a Set-level Anomaly Detection task. SetAD employs an attention-based set encoder trained via a graded learning objective, where the model learns to quantify the degree of anomalousness within an entire set. This approach directly models the complex group-level interactions that define anomalies. Furthermore, to enhance robustness and score calibration, we propose a context-calibrated anomaly scoring mechanism, which assesses a point's anomaly score by aggregating its normalized deviations from peer behavior across multiple, diverse contextual sets. Extensive experiments on 10 real-world datasets demonstrate that SetAD significantly outperforms state-of-the-art models. Notably, we show that our model's performance consistently improves with increasing set size, providing strong empirical support for the set-based formulation of anomaly detection.
- North America > United States > District of Columbia > Washington (0.05)
- Asia > China > Beijing > Beijing (0.04)
- Oceania > Australia > Western Australia > Perth (0.04)
- (2 more...)
- Health & Medicine (0.68)
- Information Technology > Security & Privacy (0.67)
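The context-calibrated scoring idea above (aggregate a point's normalized deviations from peer behavior across many contextual sets) can be sketched on raw features; SetAD itself applies it on top of a learned set encoder, so the function name and sampling scheme here are illustrative assumptions.

```python
import numpy as np

def context_calibrated_score(X, i, n_sets=50, set_size=16, seed=0):
    """Score point i by its mean normalized deviation from peer behavior
    across randomly sampled contextual sets (raw-feature sketch of the
    context-calibrated scoring idea)."""
    rng = np.random.default_rng(seed)
    others = np.array([j for j in range(len(X)) if j != i])
    scores = []
    for _ in range(n_sets):
        ctx = rng.choice(others, size=set_size, replace=False)
        mu = X[ctx].mean(axis=0)
        sd = X[ctx].std(axis=0) + 1e-8             # avoid division by zero
        scores.append(np.abs((X[i] - mu) / sd).mean())
    return float(np.mean(scores))
```

Averaging over many diverse contexts stabilizes the score: a point that deviates from its peers in most sampled sets receives a high score, while chance deviations in a single context wash out.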