Class-Imbalanced Learning



Rethinking the Value of Labels for Improving Class-Imbalanced Learning

Neural Information Processing Systems

Real-world data often exhibits long-tailed distributions with heavy class imbalance, posing great challenges for deep recognition models. We identify a persisting dilemma on the value of labels in the context of imbalanced learning: on the one hand, supervision from labels typically leads to better results than its unsupervised counterparts; on the other hand, heavily imbalanced data naturally incurs ''label bias'' in the classifier, where the decision boundary can be drastically altered by the majority classes. In this work, we systematically investigate these two facets of labels. We demonstrate, theoretically and empirically, that class-imbalanced learning can significantly benefit in both semi-supervised and self-supervised manners. Specifically, we confirm that (1) positively, imbalanced labels are valuable: given more unlabeled data, the original labels can be leveraged with the extra data to reduce label bias in a semi-supervised manner, which greatly improves the final classifier; (2) negatively however, we argue that imbalanced labels are not always useful: classifiers that are first pre-trained in a self-supervised manner consistently outperform their corresponding baselines. Extensive experiments on large-scale imbalanced datasets verify our theoretically grounded strategies, showing superior performance over previous state-of-the-art methods. Our intriguing findings highlight the need to rethink the usage of imbalanced labels in realistic long-tailed tasks. Code is available at https://github.com/YyzHarry/imbalanced-semi-self.
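The semi-supervised idea in point (1) — train on the imbalanced labeled set, pseudo-label extra unlabeled data, then retrain on the union — can be illustrated with a toy sketch. This is not the paper's actual pipeline (which uses deep networks); all names (`X_lab`, `y_pseudo`, the nearest-centroid "classifier") are illustrative assumptions chosen to keep the example self-contained:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy imbalanced 2-class data: 100 majority vs. 10 minority labeled points,
# plus a pool of unlabeled points drawn from the same two Gaussians.
X_maj = rng.normal(loc=-2.0, scale=1.0, size=(100, 2))
X_min = rng.normal(loc=+2.0, scale=1.0, size=(10, 2))
X_lab = np.vstack([X_maj, X_min])
y_lab = np.array([0] * 100 + [1] * 10)
X_unl = np.vstack([rng.normal(-2.0, 1.0, (200, 2)),
                   rng.normal(+2.0, 1.0, (200, 2))])

def fit_centroids(X, y):
    """Nearest-centroid 'classifier': one mean vector per class."""
    return np.stack([X[y == c].mean(axis=0) for c in (0, 1)])

def predict(centroids, X):
    """Assign each point to the class with the closest centroid."""
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=-1)
    return d.argmin(axis=1)

# Step 1: fit on the (imbalanced) labeled set only.
centroids = fit_centroids(X_lab, y_lab)

# Step 2: pseudo-label the unlabeled pool with the current model.
y_pseudo = predict(centroids, X_unl)

# Step 3: retrain on labeled + pseudo-labeled data combined, so the
# minority class is now estimated from far more (pseudo-labeled) points.
centroids_ssl = fit_centroids(np.vstack([X_lab, X_unl]),
                              np.concatenate([y_lab, y_pseudo]))
```

The point of step 3 is that the minority-class estimate, originally based on only 10 labeled samples, is refined by the much larger pseudo-labeled pool, which is the mechanism the abstract credits for reducing label bias.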


CLIMB: Class-imbalanced Learning Benchmark on Tabular Data

Liu, Zhining, Li, Zihao, Yang, Ze, Wei, Tianxin, Kang, Jian, Zhu, Yada, Hamann, Hendrik, He, Jingrui, Tong, Hanghang

arXiv.org Artificial Intelligence

Class-imbalanced learning (CIL) on tabular data is important in many real-world applications where the minority class holds the critical but rare outcomes. In this paper, we present CLIMB, a comprehensive benchmark for class-imbalanced learning on tabular data. CLIMB includes 73 real-world datasets across diverse domains and imbalance levels, along with unified implementations of 29 representative CIL algorithms. Built on a high-quality open-source Python package with unified API designs, detailed documentation, and rigorous code quality controls, CLIMB supports easy implementation and comparison between different CIL algorithms. Through extensive experiments, we provide practical insights on method accuracy and efficiency, highlighting the limitations of naive rebalancing, the effectiveness of ensembles, and the importance of data quality. Our code, documentation, and examples are available at https://github.com/ZhiningLiu1998/imbalanced-ensemble.
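The "naive rebalancing" whose limitations the benchmark highlights is typically random oversampling of the minority class. A minimal numpy-only sketch of that baseline (this is a generic illustration, not CLIMB's or imbalanced-ensemble's actual API):

```python
import numpy as np

def random_oversample(X, y, seed=0):
    """Naively rebalance a tabular dataset by resampling each minority
    class (with replacement) up to the majority-class count."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    n_max = counts.max()
    idx_parts = []
    for c, n in zip(classes, counts):
        idx_c = np.flatnonzero(y == c)
        if n < n_max:  # upsample smaller classes with replacement
            idx_c = rng.choice(idx_c, size=n_max, replace=True)
        idx_parts.append(idx_c)
    idx = np.concatenate(idx_parts)
    return X[idx], y[idx]

# Example: 8 majority vs. 2 minority rows become 8 vs. 8.
X = np.arange(20, dtype=float).reshape(10, 2)
y = np.array([0] * 8 + [1] * 2)
X_bal, y_bal = random_oversample(X, y)
```

Duplicating minority rows verbatim like this balances class counts but adds no new information, which is one reason ensemble methods tend to fare better in such benchmarks.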



Review for NeurIPS paper: Rethinking the Value of Labels for Improving Class-Imbalanced Learning

Neural Information Processing Systems

The problem is related to the long-tail issues that commonly arise in many machine learning tasks. The paper provides insightful comments on the effect of available labels in class-imbalanced learning from two different aspects. The results could be of interest to an even broader range of applications. Different factors are considered, such as the class distribution (degree of imbalance) and the relevance between training and test data. Their effects on both learnability and estimation accuracy are analyzed.


Review for NeurIPS paper: Rethinking the Value of Labels for Improving Class-Imbalanced Learning

Neural Information Processing Systems

Based on theoretical observations regarding unlabeled data in this setting, a pseudo-labeling strategy is proposed for training and pre-training, analyzed, and thoroughly evaluated. Even after the rebuttal and discussion, some suggestions remained around additional citations, etc. (as this is a well-established area), but these were not crucial in my opinion. However, the analysis and empirical findings were considered important by all the reviewers (especially when including the appendices), and there was unanimous support for accepting.




Class-Imbalanced Learning on Graphs: A Survey

Ma, Yihong, Tian, Yijun, Moniz, Nuno, Chawla, Nitesh V.

arXiv.org Artificial Intelligence

In recent years, graph representation learning techniques have proven effective in discovering meaningful vector representations of nodes, edges, or entire graphs, resulting in successful applications across a wide range of downstream tasks [29, 52, 68]. However, graph data often presents a significant challenge in the form of class imbalance, where one class's instances significantly outnumber those of other classes. This imbalance can lead to suboptimal performance when applying machine learning techniques to graph data. Class-imbalanced learning on graphs (CILG) is an emerging research area addressing class imbalance in graph data, where traditional methods for non-graph data might be unsuitable or ineffective for several reasons. Firstly, graph data's unique, irregular, non-Euclidean structure complicates traditional class-imbalance techniques designed for Euclidean data [78]. Secondly, graph data often holds rich relational information, necessitating specialized techniques for preservation and leverage during the learning process [51]. Lastly, node dependencies and interactions in a graph make class re-balancing complex, as naïve oversampling or undersampling may disrupt the graph's structure and thus lead to poor performance [35].
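Because naive resampling can disrupt graph structure, a common structure-preserving alternative in CILG work is to leave the graph intact and instead reweight the loss by inverse class frequency of the node labels. A small sketch of that weighting scheme (a generic illustration under the stated assumption, not a specific method from the survey):

```python
import numpy as np

def inverse_frequency_weights(labels, num_classes):
    """Per-class loss weights proportional to 1 / class frequency,
    normalized so the weights average to 1 across classes."""
    counts = np.bincount(labels, minlength=num_classes).astype(float)
    inv = 1.0 / np.maximum(counts, 1.0)  # guard against empty classes
    return inv * num_classes / inv.sum()

# Example: node labels with counts 6 / 2 / 1 yield weights 0.3 / 0.9 / 1.8,
# so errors on the rarest class cost six times as much as on the largest.
labels = np.array([0, 0, 0, 0, 0, 0, 1, 1, 2])
weights = inverse_frequency_weights(labels, num_classes=3)
```

These weights would multiply the per-node loss terms during training, up-weighting rare classes without oversampling nodes or altering edges.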