AdaCC: Cumulative Cost-Sensitive Boosting for Imbalanced Classification
Iosifidis, Vasileios; Papadopoulos, Symeon; Rosenhahn, Bodo; Ntoutsi, Eirini
arXiv.org Artificial Intelligence
Class imbalance poses a major challenge for machine learning, as most supervised learning models can exhibit bias towards the majority class and under-perform on the minority class. Cost-sensitive learning tackles this problem by treating the classes differently, typically via a user-defined, fixed misclassification cost matrix provided as input to the learner. Tuning such a matrix is challenging: it requires domain knowledge, and wrong adjustments might degrade overall predictive performance. In this work, we propose a novel cost-sensitive boosting approach for imbalanced data that dynamically adjusts the misclassification costs over the boosting rounds in response to the model's performance, instead of using a fixed misclassification cost matrix. Our method, called AdaCC, is parameter-free: it relies on the cumulative behavior of the boosting model to adjust the misclassification costs for the next boosting round, and it comes with theoretical guarantees regarding the training error. Experiments on 27 real-world datasets from different domains with high class imbalance demonstrate the superiority of our method over 12 state-of-the-art cost-sensitive boosting approaches, with consistent improvements across different measures: for instance, in the range of [0.3%-28.56%] for AUC, [3.4%-21.4%] for balanced accuracy, [4.8%-45%] for gmean and [7.4%-85.5%] for recall.
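To make the reported measures concrete, here is a minimal sketch of how recall, balanced accuracy and gmean are commonly computed for a binary problem. It is not taken from the paper: the function name `imbalance_metrics`, its parameters and the toy labels are illustrative assumptions, and the definitions used are the standard ones (recall as true positive rate, balanced accuracy as the mean of the true positive and true negative rates, gmean as their geometric mean).

```python
import math


def imbalance_metrics(y_true, y_pred, positive=1):
    """Recall, balanced accuracy and gmean for binary labels.

    Hypothetical helper for illustration; `positive` marks the minority class.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)

    # True positive rate (recall) and true negative rate (specificity).
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    tnr = tn / (tn + fp) if (tn + fp) else 0.0

    return {
        "recall": tpr,
        "balanced_accuracy": (tpr + tnr) / 2,
        "gmean": math.sqrt(tpr * tnr),
    }


# Toy imbalanced data (made up): 8 majority-class and 2 minority-class instances.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]

# A classifier that always predicts the majority class is 80% accurate,
# yet it never finds a minority instance.
majority_only = imbalance_metrics(y_true, [0] * 10)
print(majority_only)  # recall 0.0, balanced_accuracy 0.5, gmean 0.0

# A classifier that recovers one of the two minority instances.
one_hit = imbalance_metrics(y_true, [0, 0, 0, 0, 0, 0, 0, 0, 1, 0])
print(one_hit)
```

The always-majority case shows why plain accuracy is a poor fit for imbalanced data: balanced accuracy falls to chance level and gmean to zero as soon as one class is ignored entirely.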
Sep-17-2022
- Country:
  - North America
    - United States
      - New York
        - New York County > New York City (0.14)
        - Richmond County > New York City (0.04)
        - Queens County > New York City (0.04)
        - Kings County > New York City (0.04)
        - Bronx County > New York City (0.04)
      - Massachusetts > Suffolk County
        - Boston (0.04)
      - California
        - San Francisco County > San Francisco (0.28)
        - Los Angeles County > Los Angeles (0.14)
        - San Diego County > San Diego (0.04)
        - Monterey County > Pacific Grove (0.04)
        - Santa Clara County
    - Canada > Alberta
  - Europe
    - Germany (0.04)
    - United Kingdom > Wales (0.04)
    - Portugal (0.04)
    - Sweden > Stockholm
      - Stockholm (0.04)
    - Greece > Central Macedonia
      - Thessaloniki (0.04)
    - Slovenia > Upper Carniola
      - Municipality of Bled > Bled (0.04)
    - North Macedonia > Skopje Statistical Region
      - Skopje Municipality > Skopje (0.04)
    - Croatia > Dubrovnik-Neretva County
      - Dubrovnik (0.04)
    - Ireland > Leinster
      - County Dublin > Dublin (0.04)
  - Asia > China
    - Liaoning Province > Dalian (0.04)
    - Beijing > Beijing (0.04)
- Genre:
  - Research Report
    - New Finding (0.92)
    - Experimental Study (0.67)
- Technology: