Bagging Provides Assumption-free Stability

Soloff, Jake A., Barber, Rina Foygel, Willett, Rebecca

arXiv.org Artificial Intelligence 

Algorithmic stability, that is, how perturbing the training data influences a learned model, is fundamental to modern data analysis. In learning theory, certain forms of stability are necessary and sufficient for generalization (Bousquet and Elisseeff, 2002; Poggio et al., 2004; Shalev-Shwartz et al., 2010). In model selection, stability measures can reliably identify important features (Meinshausen and Bühlmann, 2010; Shah and Samworth, 2013; Ren et al., 2021). In scientific applications, stable methods promote reproducibility, a prerequisite for meaningful inference (Yu, 2013). In distribution-free prediction, stability is a key assumption for the validity of jackknife prediction intervals (Barber et al., 2021; Steinberger and Leeb, 2023). Anticipating the various benefits of stability, Breiman (1996a,b) proposed bagging as an ensemble meta-algorithm to stabilize any base learning algorithm. Bagging, short for bootstrap aggregating, refits the base algorithm to many perturbations of the training data and averages the resulting predictions. Breiman's vision of bagging as an off-the-shelf stabilizer motivates our main question: How stable is bagging on an arbitrary base algorithm, placing no assumptions on the data-generating distribution? In this paper, we first answer this question for the case of base algorithms with bounded outputs and then show extensions to the unbounded case.
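As a rough illustration of the procedure just described (refit the base algorithm on many resampled copies of the training data, then average the predictions), the following is a minimal sketch and not the authors' implementation. The names `bagged_predict` and `fit_1nn` are hypothetical, the base learner (a one-dimensional 1-nearest-neighbor rule with labels in [0, 1], so outputs are bounded) is chosen only for illustration, and resampling with replacement is assumed as the perturbation scheme.

```python
import numpy as np


def bagged_predict(fit, X, y, x_new, n_bags=200, seed=0):
    """Average the base algorithm's predictions over bootstrap resamples.

    `fit` is any function mapping (X, y) to a prediction function;
    it is treated as a black box, as in the text.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    preds = []
    for _ in range(n_bags):
        # Draw n indices with replacement: one bootstrap perturbation of the data.
        idx = rng.integers(0, n, size=n)
        model = fit(X[idx], y[idx])
        preds.append(model(x_new))
    return float(np.mean(preds))


def fit_1nn(X, y):
    """Hypothetical base learner: 1-nearest-neighbor on scalar features."""
    def predict(x_new):
        # Return the label of the closest training point.
        return y[np.argmin(np.abs(X - x_new))]
    return predict


if __name__ == "__main__":
    X = np.array([0.0, 1.0, 2.0, 3.0])
    y = np.array([0.0, 1.0, 0.0, 1.0])  # bounded outputs in [0, 1]
    print(bagged_predict(fit_1nn, X, y, x_new=1.2))
```

Because the base learner's outputs lie in [0, 1], the bagged prediction also lies in [0, 1]. Rerunning the function with one training point removed gives a simple empirical check of how much the averaged prediction moves.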
