Undersampling Algorithms for Imbalanced Classification
Taken from Improving Identification of Difficult Small Classes by Balancing Class Distribution. This technique can be implemented using the NeighbourhoodCleaningRule imbalanced-learn class. The number of neighbors used in the ENN and CNN steps can be specified via the n_neighbors argument that defaults to three. The threshold_cleaning controls whether or not the CNN is applied to a given class, which might be useful if there are multiple minority classes with similar sizes. This is kept at 0.5.
Jan-20-2020, 11:35:01 GMT