2D Stability Selection: Design Jittering for Doubly Stable Feature Selection

Nouraie, Mahdi, Zhu, Houying, Muller, Samuel

arXiv.org Machine Learning 

We study feature selection in high-dimensional regression under two distinct sources of instability: sampling variability and measurement error in the design matrix. Stability Selection addresses the former through sub-sampling and aggregation, but does not explicitly stress-test robustness to noisy predictors. We introduce doubly stable feature selection, a perturb-and-aggregate framework that targets features whose inclusion is stable both across randomization and across increasing levels of design noise. The method injects controlled additive noise into the design matrix, fits a fixed base selector such as the Lasso on the perturbed data, and aggregates selection frequencies. Sweeping over a grid of noise levels yields a stability path that summarizes robustness to measurement error while using the full sample size and isolating the effect of design perturbations. On the theory side, we show that classical model-selection conditions are preserved under sufficiently small perturbations, with a high-probability extension for Gaussian noise. Empirically, experiments on synthetic and real datasets show improved robustness compared with Stability Selection and standard base selectors.

Duplicate Docs Excel Report

Title
None found

Similar Docs  Excel Report  more

TitleSimilaritySource
None found