Kaggle Ensembling Guide
Model ensembling is a powerful technique for increasing accuracy on a variety of ML tasks. In this article I share my ensembling approaches for Kaggle competitions. The first part looks at creating ensembles from submission files. The second part looks at creating ensembles through stacked generalization/blending. I explain why ensembling reduces generalization error. Finally, I show different methods of ensembling, together with their results and code so you can try them out yourself.

"This is how you win ML competitions: you take other people's work and ensemble them together."

The most basic and convenient way to ensemble is to combine Kaggle submission CSV files. These methods need only the predictions on the test set; there is no need to retrain a model. This makes it a quick way to ensemble already existing model predictions, ideal when teaming up. Let's see why model ensembling reduces error rate and why it works better to ensemble low-correlated model ...
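As a rough illustration of ensembling from submission files, the sketch below averages the predictions of several submissions row by row. Averaging is only one simple way to combine files and is chosen here as an assumption, not as the article's specific method. The function name `average_submissions` and the column names `id` and `prediction` are hypothetical; real competitions define their own submission format.

```python
import csv
import io
from collections import defaultdict


def average_submissions(files, id_col="id", pred_col="prediction"):
    """Average the predictions of several submission files, matched by id.

    `files` is an iterable of open text-file objects. `id_col` and `pred_col`
    are assumed column names, not ones taken from the article.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    order = []  # ids in the order they are first seen
    for f in files:
        for row in csv.DictReader(f):
            key = row[id_col]
            if key not in counts:
                order.append(key)
            sums[key] += float(row[pred_col])
            counts[key] += 1
    # One (id, mean prediction) pair per test-set row.
    return [(key, sums[key] / counts[key]) for key in order]


# Two tiny in-memory "submission files" standing in for real CSVs on disk.
model_a = io.StringIO("id,prediction\n1,0.75\n2,0.25\n")
model_b = io.StringIO("id,prediction\n1,0.25\n2,0.5\n")
ensemble = average_submissions([model_a, model_b])
```

With real files, the same function could be called on objects returned by `open(path, newline="")`. No model is retrained at any point: only the test-set predictions are read and combined.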
Sep-27-2016, 18:32:45 GMT