Imbalance Class Classification using Random Forest
I agree with the idea of using boosting algorithms is better but not enough in practice. SMOTE would be a good starting point (definitely I would opt for a over-sampling strategy) but there are others. Here you can find a nice implementation of solutions for imbalanced data in python (scikit-learn-contrib). The success of any of these techniques depend largely on the nature of your data. Therefore, I would suggest you try different approaches and see how they affect your results.
Feb-27-2018, 12:45:28 GMT
- Technology: