Bagging Provides Assumption-free Stability
Soloff, Jake A., Barber, Rina Foygel, Willett, Rebecca
arXiv.org Artificial Intelligence
Algorithmic stability, that is, how perturbing training data influences a learned model, is fundamental to modern data analysis. In learning theory, certain forms of stability are necessary and sufficient for generalization (Bousquet and Elisseeff, 2002; Poggio et al., 2004; Shalev-Shwartz et al., 2010). In model selection, stability measures can reliably identify important features (Meinshausen and Bühlmann, 2010; Shah and Samworth, 2013; Ren et al., 2021). In scientific applications, stable methods promote reproducibility, a prerequisite for meaningful inference (Yu, 2013). In distribution-free prediction, stability is a key assumption for the validity of jackknife prediction intervals (Barber et al., 2021; Steinberger and Leeb, 2023). Anticipating various benefits of stability, Breiman (1996a,b) proposed bagging as an ensemble meta-algorithm to stabilize any base learning algorithm. Bagging, short for bootstrap aggregating, refits the base algorithm to many perturbations of the training data and averages the resulting predictions. Breiman's vision of bagging as an off-the-shelf stabilizer motivates our main question: how stable is bagging on an arbitrary base algorithm when no assumptions are placed on the data-generating distribution? In this paper, we first answer this question for base algorithms with bounded outputs and then show extensions to the unbounded case.
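To make the description of bagging concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical base algorithm (`one_nn`, a 1-nearest-neighbor rule on scalar inputs with labels in [0, 1], so its outputs are bounded), a toy dataset, and an arbitrary number of bootstrap resamples (`n_bags`). It averages the base algorithm's predictions over bootstrap resamples, then repeats the computation with one training point removed so the two predictions can be compared.

```python
import random

def one_nn(train):
    """Hypothetical base algorithm: 1-nearest-neighbor on scalar inputs.
    Outputs lie in [0, 1] whenever the labels do (the bounded-output case)."""
    def predict(x):
        # Return the label of the closest training point.
        return min(train, key=lambda p: abs(p[0] - x))[1]
    return predict

def bagged_predict(base, train, x, n_bags=200, seed=0):
    """Average the base algorithm's predictions over bootstrap resamples."""
    rng = random.Random(seed)
    n = len(train)
    total = 0.0
    for _ in range(n_bags):
        # Bootstrap: draw n points with replacement from the training set.
        resample = [train[rng.randrange(n)] for _ in range(n)]
        total += base(resample)(x)
    return total / n_bags

# Toy data (illustrative only): pairs of (input, label).
data = [(0.1, 0.0), (0.4, 0.0), (0.5, 1.0), (0.9, 1.0)]
data_loo = data[:2] + data[3:]  # same data with the third point removed
x_test = 0.46

full = bagged_predict(one_nn, data, x_test)
loo = bagged_predict(one_nn, data_loo, x_test)
print(full, loo, abs(full - loo))
```

On this toy data the unbagged base algorithm flips its prediction at `x_test` from 1.0 to 0.0 when the third point is removed; the printed gap shows how far the bagged prediction moves under the same perturbation. The sketch only illustrates the procedure and says nothing about the paper's actual stability guarantees.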
Apr-23-2023
- Country:
- North America > United States
- Illinois > Cook County > Chicago (0.04)
- Asia > Middle East
- Jordan (0.04)
- Genre:
- Research Report > New Finding (0.95)
- Technology: