rboost
Improved Replicable Boosting with Majority-of-Majorities
Larsen, Kasper Green, Mathiasen, Markus Engelund, Svendsen, Clement
Replicability as a property of algorithms was introduced in response to what is called the reproducibility crisis. Multiple Nature articles have pointed out that researchers are unable to replicate published findings [Baker, 2016; Ball, 2023]. As a supplement to better research practices, Impagliazzo et al. [2022] introduced replicability as a property of algorithms themselves. Informally, an algorithm is replicable if, with high probability, it outputs the same result when run on different input data drawn from the same distribution.
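The informal definition above can be checked empirically. The sketch below is a minimal illustration and is not taken from the paper. It assumes, as in the formal definition of Impagliazzo et al., that the two runs share their internal randomness while receiving independent samples. The names `rounded_mean`, `plain_mean`, `draw_gaussian_sample`, and `estimate_replicability` are hypothetical, and rounding to a randomly shifted grid is just one simple way to make a mean estimator agree across samples.

```python
import random


def rounded_mean(sample, shared_rng, grid_width=0.5):
    """Snap the empirical mean to a grid whose shift comes from shared randomness."""
    offset = shared_rng.uniform(0.0, grid_width)
    mean = sum(sample) / len(sample)
    # Nearest grid point of the form offset + k * grid_width.
    k = round((mean - offset) / grid_width)
    return offset + k * grid_width


def plain_mean(sample, shared_rng):
    """Baseline: the raw empirical mean (ignores the shared randomness)."""
    return sum(sample) / len(sample)


def draw_gaussian_sample(data_rng, n=400):
    """Hypothetical data source: n i.i.d. standard normal values."""
    return [data_rng.gauss(0.0, 1.0) for _ in range(n)]


def estimate_replicability(algorithm, draw_sample, n_trials=100, seed=0):
    """Fraction of trials in which two runs on independent samples agree exactly."""
    data_rng = random.Random(seed)
    agreements = 0
    for trial in range(n_trials):
        sample_a = draw_sample(data_rng)
        sample_b = draw_sample(data_rng)
        # Both runs get the same internal randomness (same seed), different data.
        out_a = algorithm(sample_a, random.Random(trial))
        out_b = algorithm(sample_b, random.Random(trial))
        agreements += int(out_a == out_b)
    return agreements / n_trials


if __name__ == "__main__":
    print("rounded mean:", estimate_replicability(rounded_mean, draw_gaussian_sample))
    print("plain mean:  ", estimate_replicability(plain_mean, draw_gaussian_sample))
```

In this setup the rounded estimator should agree across the two samples in most trials, whereas the plain mean of continuous data should essentially never produce two identical outputs.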
Regularized boosting with an increasing coefficient magnitude stop criterion as meta-learner in hyperparameter optimization stacking ensemble
Fdez-Díaz, Laura, Quevedo, José Ramón, Montañés, Elena
In Hyperparameter Optimization (HPO), only the hyperparameter configuration with the best performance is kept after several trials. This discards the effort spent training a model for every trial configuration, effort that could instead be reused by building an ensemble of all of them. Such an ensemble typically just averages the model predictions or weights the models by a certain probability. Recently, more sophisticated ensemble strategies, such as the Caruana method and the stacking strategy, have been proposed. On the one hand, the Caruana method performs well for HPO ensembles, since it is not affected by multicollinearity, which is prevalent in HPO. It simply computes the average over a subset of predictions selected with replacement, but it does not benefit from the generalization power of a learning process. On the other hand, stacking methods do include a learning procedure, since a meta-learner is required to build the ensemble. Yet one hardly finds advice about which meta-learner is adequate. Besides, some meta-learners may suffer from the effects of multicollinearity or need to be tuned to reduce them. This paper explores meta-learners for stacking ensembles in HPO that are free of hyperparameter tuning, able to reduce the effects of multicollinearity, and able to exploit the generalization power of the ensemble learning process. In this respect, the boosting strategy seems promising as a stacking meta-learner. In fact, it completely removes the effects of multicollinearity. This paper also proposes an implicit regularization of the classical boosting method and a novel non-parametric stop criterion, suitable only for boosting and specifically designed for HPO. The synergy between these two improvements to boosting yields competitive and promising predictive performance compared to other existing meta-learners and to ensemble approaches for HPO other than the stacking ensemble.
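The Caruana-style selection described above, averaging over a subset of predictions chosen with replacement, can be sketched as a greedy loop. This is a minimal illustration under stated assumptions and not the paper's implementation: it assumes validation predictions are available for every trial, uses mean squared error as the selection metric, and runs for a fixed number of rounds. The names `mse`, `caruana_selection`, and `n_rounds` are hypothetical.

```python
def mse(pred, target):
    """Mean squared error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def caruana_selection(predictions, y_true, n_rounds=10):
    """Greedy ensemble selection with replacement.

    predictions: list of per-model validation predictions (one list per model).
    Returns one weight per model: the fraction of rounds in which it was picked.
    """
    counts = [0] * len(predictions)
    running_sum = [0.0] * len(y_true)
    for step in range(1, n_rounds + 1):
        best_idx, best_loss = None, float("inf")
        for m, preds in enumerate(predictions):
            # Ensemble average if model m were added once more.
            candidate = [(s + p) / step for s, p in zip(running_sum, preds)]
            loss = mse(candidate, y_true)
            if loss < best_loss:  # ties go to the lower index
                best_idx, best_loss = m, loss
        counts[best_idx] += 1
        running_sum = [s + p for s, p in zip(running_sum, predictions[best_idx])]
    return [c / n_rounds for c in counts]


if __name__ == "__main__":
    y = [0.0, 0.0, 0.0, 0.0]
    preds = [
        [0.2, 0.2, 0.2, 0.2],      # biased high
        [-0.2, -0.2, -0.2, -0.2],  # biased low
        [1.0, 1.0, 1.0, 1.0],      # poor model
    ]
    print(caruana_selection(preds, y, n_rounds=4))
```

Because a model can be picked repeatedly, the final weights are just selection frequencies. No coefficients are fitted, which is why highly correlated predictions do not destabilize the result.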
Boosting in the presence of label noise
Bootkrajang, Jakramate, Kaban, Ata
Boosting is known to be sensitive to label noise. We studied two approaches to improving AdaBoost's robustness against labelling errors. One is to employ a label-noise robust classifier as the base learner; the other is to modify the AdaBoost algorithm itself to be more robust. Empirical evaluation shows that a committee of robust classifiers, although it converges faster than AdaBoost that is not label-noise aware, is still susceptible to label noise. However, pairing it with the new robust boosting algorithm we propose here results in a more resilient algorithm under mislabelling.
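One way to see why AdaBoost is sensitive to label noise is to look at a single round of its standard reweighting step. The sketch below shows only that textbook update, not the robust algorithm proposed in the paper. The function name `adaboost_reweight` and the toy data are hypothetical, and labels and predictions are assumed to be in {-1, +1}.

```python
import math


def adaboost_reweight(weights, labels, predictions):
    """One standard AdaBoost round: return (alpha, normalized new weights).

    Assumes the weighted error is strictly between 0 and 1.
    """
    err = sum(w for w, y, h in zip(weights, labels, predictions) if y != h)
    alpha = 0.5 * math.log((1.0 - err) / err)
    # Misclassified points (y * h == -1) are scaled up, correct ones scaled down.
    unnorm = [w * math.exp(-alpha * y * h)
              for w, y, h in zip(weights, labels, predictions)]
    total = sum(unnorm)
    return alpha, [u / total for u in unnorm]


if __name__ == "__main__":
    n = 10
    weights = [1.0 / n] * n
    predictions = [1] * n            # base learner predicts +1 everywhere
    labels = [-1] + [1] * (n - 1)    # first label is flipped by noise
    alpha, new_weights = adaboost_reweight(weights, labels, predictions)
    print(alpha, new_weights[0], new_weights[1])
```

In this toy case the single mislabelled point ends the round holding half of the total weight, so later base learners are pushed toward fitting the noisy label.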