Dealing with imbalanced data: undersampling, oversampling and proper cross-validation