Imbalanced classification



Learning to Re-weight Examples with Optimal Transport for Imbalanced Classification

Neural Information Processing Systems

Imbalanced data pose challenges for deep learning based classification models. One of the most widely used approaches for tackling imbalanced data is re-weighting, where training samples are assigned different weights in the loss function. Most existing re-weighting approaches treat the example weights as learnable parameters and optimize them on the meta set, entailing expensive bilevel optimization. In this paper, we propose a novel re-weighting method based on optimal transport (OT) from a distributional point of view. Specifically, we view the training set as an imbalanced distribution over its samples, which is transported by OT to a balanced distribution obtained from the meta set. The weights of the training samples are the probability mass of the imbalanced distribution and are learned by minimizing the OT distance between the two distributions. Compared with existing methods, our proposed one disengages the weight learning from the concerned classifier at each iteration. Experiments on image, text, and point cloud datasets demonstrate that our proposed re-weighting method has excellent performance, achieving state-of-the-art results in many cases and providing a promising tool for addressing the imbalanced classification issue.
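As a rough illustration of the distributional view, the sketch below derives per-sample weights from a semi-relaxed entropic OT plan: each point of a balanced meta set spreads its mass over training samples via a softmin of the squared Euclidean cost, and the induced training-side marginal serves as the weight vector. The function name and the one-step softmin solver are illustrative simplifications, not the paper's actual optimization.

```python
import numpy as np

def ot_reweight(train_feats, meta_feats, reg=0.1):
    """Semi-relaxed entropic OT sketch (illustrative, not the paper's method):
    each balanced meta sample distributes its mass over training samples
    by a softmin of the cost; the training-side marginal gives weights."""
    # cost matrix: meta rows x train columns (squared Euclidean distance)
    C = ((meta_feats[:, None, :] - train_feats[None, :, :]) ** 2).sum(-1)
    P = np.exp(-C / reg)                 # entropic kernel
    P /= P.sum(axis=1, keepdims=True)    # each meta row spreads mass summing to 1
    P *= 1.0 / len(meta_feats)           # balanced (uniform) meta mass
    w = P.sum(axis=0)                    # induced training-side marginal
    return w / w.sum()                   # normalized per-sample weights

# imbalanced training set: 8 majority samples near 0, 2 minority near 5
X_train = np.array([[0.0]] * 8 + [[5.0]] * 2)
X_meta = np.array([[0.0], [5.0]])        # balanced meta set
w = ot_reweight(X_train, X_meta)
```

Minority samples end up with larger weight because half of the balanced meta mass flows to only two training points.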


Bias-Corrected Data Synthesis for Imbalanced Learning

Lyu, Pengfei, Ma, Zhengchi, Zhang, Linjun, Zhang, Anru R.

arXiv.org Machine Learning

Imbalanced data, where the positive samples represent only a small proportion compared to the negative samples, makes it challenging for classification problems to balance the false positive and false negative rates. A common approach to addressing the challenge involves generating synthetic data for the minority group and then training classification models with both observed and synthetic data. However, since the synthetic data depends on the observed data and fails to replicate the original data distribution accurately, prediction accuracy is reduced when the synthetic data is naively treated as the true data. In this paper, we address the bias introduced by synthetic data and provide consistent estimators for this bias by borrowing information from the majority group. We propose a bias correction procedure to mitigate the adverse effects of synthetic data, enhancing prediction accuracy while avoiding overfitting. This procedure is extended to broader scenarios with imbalanced data, such as imbalanced multi-task learning and causal inference. Theoretical properties, including bounds on bias estimation errors and improvements in prediction accuracy, are provided. Simulation results and data analysis on handwritten digit datasets demonstrate the effectiveness of our method.
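For context, the synthetic-data setup the paper builds on can be sketched with SMOTE-style interpolation between minority-class neighbours. This shows only the data-generation step, not the paper's bias-correction estimator, and `interpolate_minority` is a hypothetical helper name.

```python
import numpy as np

def interpolate_minority(X_min, n_new, k=3, rng=None):
    """SMOTE-style sketch: synthesize minority samples by interpolating
    between a minority point and one of its k nearest minority neighbours.
    (Illustrates the synthetic-data setup only; the bias-correction step,
    which borrows information from the majority class, is not shown.)"""
    rng = np.random.default_rng(rng)
    n = len(X_min)
    # pairwise squared distances within the minority class
    D = ((X_min[:, None, :] - X_min[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(D, np.inf)          # exclude self-matches
    nbrs = np.argsort(D, axis=1)[:, :k]  # k nearest neighbours per point
    out = []
    for _ in range(n_new):
        i = rng.integers(n)
        j = nbrs[i, rng.integers(k)]
        lam = rng.random()               # interpolation coefficient in [0, 1)
        out.append(X_min[i] + lam * (X_min[j] - X_min[i]))
    return np.array(out)

X_min = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]])
synthetic = interpolate_minority(X_min, n_new=10, rng=0)
```

Because each synthetic point is a convex combination of two observed minority points, the generated data stay inside the minority class's convex hull, which is exactly the dependence on observed data that the paper's correction addresses.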



EIoU-EMC: A Novel Loss for Domain-specific Nested Entity Recognition

Zhang, Jian, Zhang, Tianqing, Li, Qi, Wang, Hongwei

arXiv.org Artificial Intelligence

In recent years, research has mainly focused on the general NER task, but challenges remain for the nested NER task in specific domains. In particular, low-resource and class-imbalance scenarios impede wide application in the biomedical and industrial domains. In this study, we design a novel loss, EIoU-EMC, by enhancing the implementation of the Intersection over Union loss and the multi-class loss. Our proposed method specifically leverages information about entity boundaries and entity classification, thereby enhancing the model's capacity to learn from a limited number of data samples. To validate the performance of this method on the NER task, we conducted experiments on three distinct biomedical NER datasets and one dataset we constructed from industrial complex-equipment maintenance documents. Compared to strong baselines, our method demonstrates competitive performance across all datasets. In our experimental analysis, the proposed method exhibits significant improvements in entity boundary recognition and entity classification. Our code is available here.
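The Intersection over Union ingredient of such a boundary-aware loss can be illustrated for token spans; the sketch below shows only that generic ingredient, not the EIoU-EMC combination itself.

```python
def span_iou(pred, gold):
    """IoU between two token spans (start, end), end-inclusive, as used by
    boundary-aware objectives for (nested) NER: 1.0 for identical spans,
    0.0 for disjoint ones, and a graded overlap score in between."""
    s1, e1 = pred
    s2, e2 = gold
    inter = max(0, min(e1, e2) - max(s1, s2) + 1)   # overlapping tokens
    union = (e1 - s1 + 1) + (e2 - s2 + 1) - inter   # tokens covered by either
    return inter / union
```

A loss term like `1 - span_iou(pred, gold)` rewards predictions whose boundaries are close to the gold span even when they are not exactly right, which is what helps in low-resource settings.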


A binary PSO based ensemble under-sampling model for rebalancing imbalanced training data

Li, Jinyan, Wu, Yaoyang, Fong, Simon, Tallón-Ballesteros, Antonio J., Yang, Xin-she, Mohammed, Sabah, Wu, Feng

arXiv.org Artificial Intelligence

Ensemble techniques and under-sampling techniques are both effective tools for imbalanced-dataset classification problems. In this paper, we propose a novel ensemble method that combines the advantages of ensemble learning for biasing classifiers with a new under-sampling method. The under-sampling method, named Binary PSO instance selection, works together with ensemble classifiers to find the most suitable size and combination of majority-class samples to build a new dataset together with the minority-class samples. The proposed method adopts a multi-objective strategy; its contribution is a notable improvement in imbalanced-classification performance while preserving the integrity of the original dataset as much as possible. We evaluated the proposed method and compared its performance on imbalanced datasets with several conventional basic ensemble methods. Experiments were also conducted on these imbalanced datasets using an improved version in which ensemble classifiers are wrapped in the Binary PSO instance selection. According to the experimental results, our proposed methods outperform single ensemble methods, state-of-the-art under-sampling methods, and combinations of these methods with the traditional PSO instance selection algorithm.
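A minimal sketch of binary PSO instance selection follows, assuming a deliberately simple fitness (match the selected majority count to the minority count) in place of the paper's ensemble-classifier scoring; all names and hyperparameters are illustrative.

```python
import numpy as np

def binary_pso_select(n_major, n_minor, n_particles=20, iters=50, seed=0):
    """Binary PSO sketch: each particle is a 0/1 mask over majority-class
    samples; bits are resampled from a sigmoid of the velocity. The fitness
    here only rewards balanced counts, whereas the paper scores candidate
    subsets with ensemble classifiers."""
    rng = np.random.default_rng(seed)
    X = rng.integers(0, 2, size=(n_particles, n_major)).astype(float)
    V = rng.normal(0.0, 1.0, size=(n_particles, n_major))

    def fitness(mask):
        return -abs(mask.sum() - n_minor)   # best (0) when counts match

    pbest, pbest_fit = X.copy(), np.array([fitness(x) for x in X])
    g = pbest[np.argmax(pbest_fit)].copy()  # global best mask
    w, c1, c2 = 0.7, 1.5, 1.5               # inertia and attraction weights
    for _ in range(iters):
        r1, r2 = rng.random(V.shape), rng.random(V.shape)
        V = w * V + c1 * r1 * (pbest - X) + c2 * r2 * (g - X)
        X = (rng.random(V.shape) < 1.0 / (1.0 + np.exp(-V))).astype(float)
        fit = np.array([fitness(x) for x in X])
        improved = fit > pbest_fit
        pbest[improved], pbest_fit[improved] = X[improved], fit[improved]
        g = pbest[np.argmax(pbest_fit)].copy()
    return g.astype(bool)                   # selection mask over majority set

mask = binary_pso_select(n_major=30, n_minor=10)
```

The returned mask indexes the retained majority samples, which would then be combined with all minority samples to form the rebalanced training set.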



Optimal Downsampling for Imbalanced Classification with Generalized Linear Models

Chen, Yan, Blanchet, Jose, Dembczynski, Krzysztof, Nern, Laura Fee, Flores, Aaron

arXiv.org Machine Learning

Downsampling, or under-sampling, is a technique used for large and highly imbalanced classification problems. We study optimal downsampling for imbalanced classification using generalized linear models (GLMs). We propose a pseudo maximum likelihood estimator and study its asymptotic normality in the context of increasingly imbalanced populations relative to an increasingly large sample size. We provide theoretical guarantees for the introduced estimator. Additionally, we compute the optimal downsampling rate using a criterion that balances statistical accuracy and computational efficiency. Our numerical experiments, conducted on both synthetic and empirical data, further validate our theoretical results and demonstrate that the introduced estimator outperforms commonly available alternatives.
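The downsampling setup can be illustrated with the classical intercept correction for logistic regression under outcome-dependent sampling: keeping each negative with probability r shifts the fitted intercept by ln(1/r), which can be undone after fitting. This is a standard textbook sketch, not the paper's pseudo maximum likelihood estimator or its optimal-rate criterion.

```python
import numpy as np

def fit_logistic(X, y, lr=0.5, iters=3000):
    """Plain gradient-ascent logistic regression (intercept + slopes)."""
    Xb = np.c_[np.ones(len(X)), X]
    beta = np.zeros(Xb.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-Xb @ beta))
        beta += lr * Xb.T @ (y - p) / len(y)   # mean log-likelihood gradient
    return beta

rng = np.random.default_rng(0)
n = 20000
x = rng.normal(size=n)
true_b0, true_b1 = -3.0, 1.0                   # rare positives
p = 1.0 / (1.0 + np.exp(-(true_b0 + true_b1 * x)))
y = rng.random(n) < p

r = 0.1                                        # keep 10% of the negatives
keep = y | (rng.random(n) < r)
b0_fit, b1_fit = fit_logistic(x[keep, None], y[keep].astype(float))
b0_corrected = b0_fit + np.log(r)              # undo the sampling shift
```

Because the logistic model is closed under this outcome-dependent sampling, only the intercept needs the ln(r) offset; the slope estimate is unaffected.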


When resampling/reweighting improves feature learning in imbalanced classification?: A toy-model study

Obuchi, Tomoyuki, Tanaka, Toshiyuki

arXiv.org Machine Learning

Classifiers applied to such datasets tend to perform poorly for minority classes, which poses a major challenge in areas such as visual recognition. Although several methods to mitigate class imbalance have been proposed so far [6, 7, 8], recent advances in deep learning have shed new light on this issue, resulting in numerous studies applying those approaches to classifiers based on deep neural networks (DNNs) [5, 9, 10, 11, 12, 13, 1, 2, 14, 15, 16, 17]. Among the approaches proposed so far, we focus on two simple strategies, reweighting and resampling, which are commonly employed to mitigate class imbalance. The resampling strategy tries to balance the samples in the dataset by oversampling the minority classes and/or undersampling the majority classes, while the reweighting strategy assigns an additional weight to each term of the loss in order to counteract the class imbalance. The effectiveness of these strategies has been empirically verified in a wide range of studies [13, 1, 2, 14, 6, 7]. Despite this body of work, a transparent description or understanding of when they are useful is still incomplete. In particular, how class imbalance affects the quality of feature learning is an important problem in the context of representation learning in DNNs, but a thorough understanding of this issue is still missing. Recently, [2] reported an interesting observation that feature learning becomes better if no resampling is applied. More specifically, on the basis of their extensive experiments on visual recognition tasks using DNNs, they reported that the best classification performance was achieved when the whole network was first trained without any resampling and then only the last output layer (the final classifier) was retrained with class-balanced resampling.
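The reweighting strategy described above can be sketched with standard inverse-frequency class weights, w_c = N/(K n_c), applied to each loss term; this is the common baseline form, not any specific paper's variant.

```python
import numpy as np

def class_weights(labels, n_classes):
    """Inverse-frequency reweighting: w_c = N / (K * n_c), so every class
    contributes equally to the expected loss regardless of its frequency."""
    counts = np.bincount(labels, minlength=n_classes)
    return len(labels) / (n_classes * counts)

def weighted_nll(log_probs, labels, weights):
    """Mean negative log-likelihood with each term scaled by its class weight."""
    per_sample = -log_probs[np.arange(len(labels)), labels]
    return float(np.mean(weights[labels] * per_sample))

labels = np.array([0, 0, 0, 1])              # 3:1 imbalance
w = class_weights(labels, n_classes=2)       # [2/3, 2]: minority up-weighted
log_probs = np.log(np.full((4, 2), 0.5))     # a uniform dummy classifier
loss = weighted_nll(log_probs, labels, w)
```

Note that the weights sum to N over the dataset (3 * 2/3 + 1 * 2 = 4), so reweighting changes the per-class balance of the loss without changing its overall scale.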


MPOFI: Multichannel Partially Observed Functional Modeling for Defect Classification with Imbalanced Dataset via Deep Metric Learning

Xie, Yukun, Du, Juan, Zhang, Chen

arXiv.org Machine Learning

In modern manufacturing, most products on a production line are conforming; the few nonconforming products exhibit different defect types. Identifying the defect types can help with further root-cause diagnosis of production lines. With the development of sensing, continuous signals of process variables can be collected at high resolution and can be regarded as multichannel functional data. These signals carry abundant information to characterize the process and help identify the defect types. Motivated by a real example from the pipe-tightening process, we target defect classification when each sample is multichannel functional data. However, the available samples for each defect type are limited and imbalanced. Moreover, the functions are partially observed, since the pre-tightening stage before the pipe-tightening process is unobserved. Classifying defect samples based on imbalanced, multichannel, and partially observed functional data is important but challenging. Thus, we propose an innovative framework known as "Multichannel Partially Observed Functional Modeling for Defect Classification with an Imbalanced Dataset" (MPOFI). The framework leverages the power of deep metric learning in conjunction with a neural network specially crafted for processing functional data. This paper introduces a neural network explicitly tailored to multichannel and partially observed functional data, complemented by a corresponding loss function for training on imbalanced datasets. The results from a real-world case study demonstrate the superior accuracy of our framework compared to existing benchmarks.
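Deep metric learning of the kind this framework leverages is typically built on objectives such as the triplet margin loss; the sketch below shows that generic ingredient only, not MPOFI's tailored loss or network.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Triplet margin loss, a standard deep-metric-learning objective:
    pull same-class embeddings together and push different-class embeddings
    at least `margin` farther away, which helps when classes have few samples."""
    d_pos = np.linalg.norm(anchor - positive, axis=-1)   # same-class distance
    d_neg = np.linalg.norm(anchor - negative, axis=-1)   # cross-class distance
    return float(np.maximum(0.0, d_pos - d_neg + margin).mean())
```

Because the loss is built from pairwise distances rather than per-class decision boundaries, every minority sample contributes through many triplets, which is one reason metric learning is attractive for imbalanced defect data.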