A Comprehensive Survey on Imbalanced Data Learning
Gao, Xinyi, Xie, Dongting, Zhang, Yihang, Wang, Zhengren, He, Conghui, Yin, Hongzhi, Zhang, Wentao
arXiv.org Artificial Intelligence
With the expansion of data availability, machine learning (ML) has achieved remarkable breakthroughs in both academia and industry. However, imbalanced data distributions are prevalent in many types of raw data and severely hinder the performance of ML by biasing its decision-making processes. To deepen the understanding of imbalanced data and facilitate related research and applications, this survey systematically analyzes various real-world data formats and organizes existing research on these formats into four distinct categories: data re-balancing, feature representation, training strategy, and ensemble learning. This structured analysis helps researchers comprehensively understand the pervasive nature of imbalance across diverse data formats, thereby paving a clearer path toward achieving specific research goals. We also provide an overview of relevant open-source libraries, spotlight current challenges, and offer novel insights aimed at fostering future advancements in this critical area of study.
Feb-12-2025
- Country:
- Oceania > Australia
- Queensland > Brisbane (0.04)
- North America
- United States > New York
- New York County > New York City (0.04)
- Canada > Quebec
- Capitale-Nationale Region
- Québec (0.04)
- Quebec City (0.04)
- Europe
- Italy > Sicily (0.04)
- Portugal > Braga
- Braga (0.04)
- Netherlands > North Holland
- Amsterdam (0.04)
- France > Grand Est
- Bas-Rhin > Strasbourg (0.04)
- Asia
- Singapore (0.04)
- Bangladesh (0.04)
- Middle East
- Syria (0.04)
- Israel > Southern District
- Eilat (0.04)
- China
- Genre:
- Research Report (1.00)
- Overview (1.00)
- Industry:
- Education (1.00)
- Health & Medicine > Diagnostic Medicine (0.93)
- Media > News (0.67)
- Information Technology (0.67)
- Technology:
- Information Technology
- Sensing and Signal Processing > Image Processing (1.00)
- Information Management (1.00)
- Data Science > Data Mining (1.00)
- Communications (0.93)
- Artificial Intelligence
- Vision (1.00)
- Representation & Reasoning (1.00)
- Natural Language (1.00)
- Machine Learning
- Statistical Learning (1.00)
- Performance Analysis > Accuracy (1.00)
- Neural Networks > Deep Learning (1.00)
- Inductive Learning (1.00)
- Evolutionary Systems (1.00)
- Ensemble Learning (0.92)