Handling imbalanced data when building regression models
This is a good question, and one that gets raised time and time again. A colleague (Sven Crone from Lancaster University in the UK) and I published a paper on this issue last year in the International Journal of Forecasting. A summary of our findings can also be found in the book "Credit Scoring, Response Modeling and Insurance Rating". There are also some very good, highly cited papers by G. Weiss from 2004/5, which are referenced in our paper and book. What we found was that for some methods of model construction, sample imbalance was not an issue at all, not even to a tiny degree.
Apr-22-2016, 03:05:25 GMT