Logistic regression on large imbalanced datasets
Hello, I am working on a highly imbalanced dataset (more than 20K negative examples and about 100 positive examples), and I am trying to build a logistic regression model. My current approach undersamples the negative examples. However, this approach has a couple of problems: 1) Different samples of the negatives give different LR models. How can these models be combined into one that generalizes, and how should its output be interpreted?
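To make the setup concrete, here is a minimal sketch of one possible way to handle the several-models issue. It is not a definitive answer. It fits one LR model per undersample, shifts each intercept by log(beta), where beta is the fraction of negatives kept (the usual prior correction for undersampling), and then averages the coefficients. Because LR is linear in the log-odds, averaging coefficients is the same as averaging the models' logits, so the result is still a single interpretable LR model. The function names (`fit_logreg`, `undersample_and_average`), the synthetic Gaussian data, and the hyperparameters are all assumptions made for illustration.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logreg(X, y, lr=0.1, epochs=200):
    """Plain batch gradient descent on the log-loss; returns (weights, bias)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                gw[j] += err * xi[j]
            gb += err
        w = [wj - lr * gj / n for wj, gj in zip(w, gw)]
        b -= lr * gb / n
    return w, b

def undersample_and_average(pos, neg, n_models=5, ratio=1, seed=0):
    """Fit one model per undersample, correct the intercept, average coefficients."""
    rng = random.Random(seed)
    k = min(len(neg), ratio * len(pos))   # negatives kept per model
    beta = k / len(neg)                   # fraction of negatives kept
    models = []
    for _ in range(n_models):
        sample = rng.sample(neg, k)       # a fresh undersample each time
        X = pos + sample
        y = [1] * len(pos) + [0] * k
        w, b = fit_logreg(X, y)
        # Undersampling inflates the odds by 1/beta; adding log(beta) undoes it.
        models.append((w, b + math.log(beta)))
    d = len(pos[0])
    w_avg = [sum(m[0][j] for m in models) / n_models for j in range(d)]
    b_avg = sum(m[1] for m in models) / n_models
    return w_avg, b_avg

def predict_proba(w, b, x):
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# Synthetic stand-in for the real data (assumption): 20,000 negatives, 100 positives.
data_rng = random.Random(42)
neg = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(20000)]
pos = [[data_rng.gauss(2, 1), data_rng.gauss(2, 1)] for _ in range(100)]

w, b = undersample_and_average(pos, neg)
print("averaged weights:", w)
print("corrected intercept:", b)
print("P(y=1 | x=[3, 3]):", predict_proba(w, b, [3.0, 3.0]))
```

The averaged weights can be read like ordinary LR coefficients (change in log-odds per unit of the feature), and the corrected intercept puts the predicted probabilities back on the scale of the original class balance, not the balanced training samples.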
May-12-2017, 04:10:02 GMT