[P] Is XGBoost w/ iterating undersampling doable? • r/MachineLearning
I know this might sound like a "google this for me" question, but bear with me (I googled it). I'm working with a highly imbalanced data set where the minority class accounts for 1.5% of the total. When nothing is done to address the imbalance, most models predict poorly, because most algorithms reduce overall cost during training by minimizing cost on the majority class, to the detriment of the minority class. So far I've tried ANNs, RFs, XGBs, and SVMs and have found that XGB and RF outperform the others on this particular problem, so the rest of this post will be about RF and XGB. I've tried penalizing misclassification of the minority class much more heavily than the majority class to fix the imbalance at the algorithmic level, but I've found undersampling and then training on the resulting data set to be more effective in my case.
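To make the two approaches mentioned above concrete, here is a minimal, library-free sketch of random undersampling next to the class-weight ratio used for cost-sensitive training. The function names (`undersample_indices`, `minority_weight`), the `ratio` and `seed` parameters, and the toy labels are all assumptions made up for illustration; this is not the poster's actual code or data.

```python
import random
from collections import Counter


def undersample_indices(labels, minority_label=1, ratio=1.0, seed=0):
    """Return sorted row indices: every minority row plus a random majority subset.

    ratio is the number of majority rows kept per minority row
    (1.0 gives a balanced set). Hypothetical helper, for illustration only.
    """
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == minority_label]
    majority = [i for i, y in enumerate(labels) if y != minority_label]
    # Never ask for more majority rows than exist.
    n_keep = min(len(majority), int(round(ratio * len(minority))))
    kept = rng.sample(majority, n_keep)
    return sorted(minority + kept)


def minority_weight(labels, minority_label=1):
    """Majority/minority count ratio, a common choice for the minority class weight."""
    counts = Counter(y == minority_label for y in labels)
    return counts[False] / counts[True]


# Toy labels with a 1.5% minority class: 15 positives in 1000 rows (made-up data).
labels = [1] * 15 + [0] * 985

idx = undersample_indices(labels, ratio=1.0, seed=42)
balanced = [labels[i] for i in idx]   # labels of the undersampled training set
weight = minority_weight(labels)      # alternative: weight instead of resampling
```

The indices in `idx` would be used to select rows of the feature matrix before fitting RF or XGB. The value in `weight` is the kind of number typically passed to XGBoost's `scale_pos_weight` parameter when weighting rather than resampling. Calling `undersample_indices` with different seeds draws different majority subsets, which is one possible reading of the "iterating" in the title.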
Aug-4-2017, 19:05:08 GMT