Optimizing Classifiers for Imbalanced Training Sets
Karakoulas, Grigoris I., Shawe-Taylor, John
–Neural Information Processing Systems
Following recent results [9, 8] showing the importance of the fat-shattering dimension in explaining the beneficial effect of a large margin on generalization performance, this paper investigates the implications of these results for imbalanced datasets and develops two approaches to setting the classification threshold. The approaches are incorporated into ThetaBoost, a boosting algorithm for dealing with unequal loss functions. The performance of ThetaBoost and of the two approaches is evaluated experimentally.
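To make the idea of threshold setting under unequal losses concrete, the following is a minimal sketch and not the paper's method: it does not implement ThetaBoost or either of the two approaches, and it uses no margin or fat-shattering analysis. It assumes a classifier that outputs real-valued scores, labels in {+1, -1}, and hypothetical costs `cost_fn` and `cost_fp` for false negatives and false positives; the function name `pick_threshold` is likewise illustrative. It simply scans candidate thresholds and keeps the one with the lowest total weighted cost on the given sample.

```python
from typing import List, Sequence, Tuple


def pick_threshold(
    scores: Sequence[float],
    labels: Sequence[int],
    cost_fn: float = 5.0,  # hypothetical cost of missing a positive
    cost_fp: float = 1.0,  # hypothetical cost of a false alarm
) -> Tuple[float, float]:
    """Return (threshold, cost) minimizing the weighted error on the sample.

    An example is predicted positive when its score is >= threshold.
    Labels are assumed to be +1 (rare class) or -1 (common class).
    """
    # Candidate thresholds: every distinct score, plus one value above the
    # maximum so that "predict everything negative" is also considered.
    candidates: List[float] = sorted(set(scores)) + [max(scores) + 1.0]

    best_threshold = candidates[0]
    best_cost = float("inf")
    for t in candidates:
        false_neg = sum(1 for s, y in zip(scores, labels) if y == 1 and s < t)
        false_pos = sum(1 for s, y in zip(scores, labels) if y == -1 and s >= t)
        cost = cost_fn * false_neg + cost_fp * false_pos
        if cost < best_cost:  # strict comparison keeps the lowest threshold on ties
            best_threshold, best_cost = t, cost
    return best_threshold, best_cost


# Toy imbalanced sample: two positives among six examples.
scores = [-2.0, -1.0, -0.5, 0.2, 0.4, 1.5]
labels = [-1, -1, 1, -1, -1, 1]

unequal = pick_threshold(scores, labels, cost_fn=5.0, cost_fp=1.0)
equal = pick_threshold(scores, labels, cost_fn=1.0, cost_fp=1.0)
print(unequal, equal)
```

On this toy sample, equal costs favor a high threshold that misses one of the two positives, whereas weighting false negatives more heavily moves the threshold down so that both positives are recovered at the price of two false alarms.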
Dec-31-1999