Proper Balancing for Cross Validation
Here we plot the precision results of balancing, with under-sampling, only the train set of each CV fold before fitting the model on it and making predictions on the CV fold's test set:

Here we plot the precision results of balancing, with over-sampling, only the train set of each CV fold before fitting the model on it and making predictions on the CV fold's test set:

Clearly, balancing has not helped so far in getting good test results, with either under-sampling or over-sampling. However, improving those results is out of scope for this article :-) and the goal of this article is achieved: to make the model produce, on each CV fold's test set, evaluation metric scores similar to those it would produce on unseen data, for the case where the train data are balanced.
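The procedure described above, balancing only the train part of each CV fold and leaving the fold's test part untouched, can be sketched in plain Python. This is a minimal illustration and not the code used to produce the plots: the helper names (stratified_folds, undersample, balanced_cv_splits), the toy labels and the choice of random under-sampling are all assumptions made for the example.

```python
import random
from collections import defaultdict


def stratified_folds(labels, n_splits, rng):
    """Split sample indices into n_splits folds, keeping class ratios similar."""
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    folds = [[] for _ in range(n_splits)]
    for members in by_class.values():
        rng.shuffle(members)
        # Deal the shuffled indices of each class round-robin over the folds.
        for pos, i in enumerate(members):
            folds[pos % n_splits].append(i)
    return folds


def undersample(indices, labels, rng):
    """Randomly drop samples so every class is as large as the rarest one."""
    by_class = defaultdict(list)
    for i in indices:
        by_class[labels[i]].append(i)
    n_min = min(len(members) for members in by_class.values())
    balanced = []
    for members in by_class.values():
        balanced.extend(rng.sample(members, n_min))
    return sorted(balanced)


def balanced_cv_splits(labels, n_splits=5, seed=0):
    """Yield (balanced_train_idx, untouched_test_idx) for each CV fold."""
    rng = random.Random(seed)
    folds = stratified_folds(labels, n_splits, rng)
    for k in range(n_splits):
        test_idx = sorted(folds[k])
        train_idx = [i for j, fold in enumerate(folds) if j != k for i in fold]
        # Only the train indices are balanced; the test fold keeps its
        # original class distribution, as unseen data would.
        yield undersample(train_idx, labels, rng), test_idx


# Hypothetical imbalanced toy labels: 90 negatives, 10 positives.
labels = [0] * 90 + [1] * 10
splits = list(balanced_cv_splits(labels, n_splits=5, seed=0))
```

In each fold, a model would be fitted on the balanced train indices and scored (for example with precision) on the untouched test indices. Over-sampling would follow the same pattern, with the minority class resampled up inside the train indices instead of the majority class being cut down.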