Bagging Provides Assumption-free Stability

Soloff, Jake A., Barber, Rina Foygel, Willett, Rebecca

Apr-23-2023, arXiv.org Artificial Intelligence

Algorithmic stability (that is, how perturbing training data influences a learned model) is fundamental to modern data analysis. In learning theory, certain forms of stability are necessary and sufficient for generalization (Bousquet and Elisseeff, 2002; Poggio et al., 2004; Shalev-Shwartz et al., 2010). In model selection, stability measures can reliably identify important features (Meinshausen and Bühlmann, 2010; Shah and Samworth, 2013; Ren et al., 2021). In scientific applications, stable methods promote reproducibility, a prerequisite for meaningful inference (Yu, 2013). In distribution-free prediction, stability is a key assumption for the validity of jackknife prediction intervals (Barber et al., 2021; Steinberger and Leeb, 2023). Anticipating various benefits of stability, Breiman (1996a,b) proposed bagging as an ensemble meta-algorithm to stabilize any base learning algorithm. Bagging, short for bootstrap aggregating, refits the base algorithm to many perturbations of the training data and averages the resulting predictions. Breiman's vision of bagging as an off-the-shelf stabilizer motivates our main question: How stable is bagging on an arbitrary base algorithm, placing no assumptions on the data-generating distribution? In this paper, we first answer this question for the case of base algorithms with bounded outputs and then show extensions to the unbounded case.
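The bagging procedure described in the abstract (refit the base algorithm on many perturbed copies of the training data, then average the predictions) can be sketched as below. This is only an illustrative sketch under stated assumptions, not the paper's construction: the base learner `base_fit` (a 1-nearest-neighbour rule with outputs clipped to [0, 1]), the toy data, and the helper `loo_changes` are all hypothetical, and the leave-one-out change in prediction is used here as one plausible way to probe stability, since the abstract does not state the exact definition the authors use.

```python
import random
from statistics import mean


def base_fit(sample):
    """Hypothetical base algorithm: 1-nearest-neighbour regression on (x, y) pairs.

    Outputs are clipped to [0, 1] to mimic the bounded-output setting.
    """
    def predict(x):
        nearest = min(sample, key=lambda pt: abs(pt[0] - x))
        return min(1.0, max(0.0, nearest[1]))
    return predict


def bagged_predict(data, x, n_bags=200, seed=0):
    """Average the base algorithm's predictions over bootstrap resamples."""
    rng = random.Random(seed)
    n = len(data)
    preds = []
    for _ in range(n_bags):
        # Bootstrap: draw n training points with replacement.
        boot = [data[rng.randrange(n)] for _ in range(n)]
        preds.append(base_fit(boot)(x))
    return mean(preds)


def loo_changes(data, x, **kwargs):
    """Absolute change in the bagged prediction when each point is removed.

    Assumption: leave-one-out perturbation is used as the stability probe.
    """
    full = bagged_predict(data, x, **kwargs)
    return [
        abs(full - bagged_predict(data[:i] + data[i + 1:], x, **kwargs))
        for i in range(len(data))
    ]


# Toy data (made up for illustration): random inputs with 0/1 responses.
rng = random.Random(1)
data = [(rng.random(), float(rng.random() < 0.5)) for _ in range(20)]

changes = loo_changes(data, x=0.5, n_bags=200, seed=0)
print("mean leave-one-out change:", mean(changes))
print("max leave-one-out change:", max(changes))
```

Because the base predictions lie in [0, 1], each leave-one-out change also lies in [0, 1]; comparing these changes with those of the unbagged `base_fit` on the same data gives a rough empirical feel for the stabilizing effect the abstract refers to.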
