Do We Need Balanced Sampling?


In many real-world classification tasks, such as churn prediction and fraud detection, we often encounter the class imbalance problem, in which one class is significantly outnumbered by the other. Class imbalance poses a serious challenge for standard classification algorithms: on imbalanced data sets, most of them misclassify minority instances more often than majority instances. Accuracy can also hide the failure. For example, when a model is trained on a data set in which 1% of the instances belong to the minority class, a 99% accuracy rate can be achieved simply by classifying every instance as belonging to the majority class. Indeed, learning on imbalanced data sets is considered one of the ten challenging problems in data mining research.
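The 99% figure can be checked with a few lines of arithmetic. The sketch below is only an illustration: the data set of 1,000 labels, the variable names, and the "always predict the majority" rule are assumptions made up for this example, not data or a model from any real task.

```python
# Hypothetical labels: 1,000 instances, 1% minority (1) and 99% majority (0).
labels = [1] * 10 + [0] * 990

# A trivial "classifier" that assigns every instance to the majority class.
predictions = [0] * len(labels)

# Accuracy: fraction of instances whose prediction matches the label.
correct = sum(p == y for p, y in zip(predictions, labels))
accuracy = correct / len(labels)

# Minority recall: fraction of true minority instances that were detected.
minority_total = sum(y == 1 for y in labels)
minority_hits = sum(p == 1 and y == 1 for p, y in zip(predictions, labels))
minority_recall = minority_hits / minority_total

print(f"accuracy: {accuracy:.2%}")
print(f"minority recall: {minority_recall:.2%}")
```

Under these assumptions the accuracy is 99% while the recall on the minority class is 0%: the model looks excellent on the headline metric and yet never identifies a single minority instance, which is usually the class of interest in churn or fraud problems.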