AITopics

Genre: Research Report > New Finding (1.00)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Information Management (0.93)

Neural Information Processing Systems, Dec-25-2025, 12:47:38 GMT

Rethinking Semi-Supervised Imbalanced Node Classification from Bias-Variance Decomposition

This paper introduces a new approach to address the issue of class imbalance in graph neural networks (GNNs) for learning on graph-structured data. Our approach integrates imbalanced node classification and Bias-Variance Decomposition, establishing a theoretical framework that closely relates data imbalance to model variance. We also leverage a graph augmentation technique to estimate the variance and design a regularization term to alleviate the impact of imbalance. Extensive experiments are conducted on multiple benchmarks, including naturally imbalanced datasets and public-split class-imbalanced datasets, demonstrating that our approach outperforms state-of-the-art methods in various imbalanced scenarios. This work provides a novel theoretical perspective for addressing the problem of imbalanced node classification in GNNs.
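The abstract does not give the variance estimator or the regularization term, so the following is only a minimal sketch of the general idea, assuming the variance is approximated by how much a node's predicted class probabilities change across augmented views of the graph. The names `variance_penalty`, `regularized_loss`, and the weight `lam` are hypothetical and do not come from the paper.

```python
from typing import Sequence


def variance_penalty(views: Sequence[Sequence[Sequence[float]]]) -> float:
    """Mean per-node, per-class variance of predictions across augmented views.

    views[v][i][c] is the predicted probability of class c for node i
    under augmented view v (hypothetical layout, chosen for illustration).
    """
    n_views = len(views)
    n_nodes = len(views[0])
    n_classes = len(views[0][0])
    total = 0.0
    for i in range(n_nodes):
        for c in range(n_classes):
            vals = [views[v][i][c] for v in range(n_views)]
            mean = sum(vals) / n_views
            # Population variance of this (node, class) prediction across views.
            total += sum((x - mean) ** 2 for x in vals) / n_views
    return total / (n_nodes * n_classes)


def regularized_loss(supervised_loss: float, views, lam: float = 1.0) -> float:
    """Supervised loss plus a weighted variance penalty (lam is an assumed weight)."""
    return supervised_loss + lam * variance_penalty(views)
```

Predictions that agree across views contribute zero penalty, so the added term only pushes on nodes whose predictions are unstable under augmentation.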

artificial intelligence, machine learning, proceedings

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.80)

Neural Information Processing Systems, Dec-24-2025, 08:57:06 GMT

Co-Modality Graph Contrastive Learning for Imbalanced Node Classification

Graph contrastive learning (GCL), leveraging graph augmentations to convert graphs into different views and further train graph neural networks (GNNs), has achieved considerable success on graph benchmark datasets. Yet, there are still some gaps in directly applying existing GCL methods to real-world data. First, handcrafted graph augmentations require trial and error, but still cannot yield consistent performance on multiple tasks. Second, most real-world graph data present a class-imbalanced distribution, but existing GCL methods are not immune to data imbalance. Therefore, this work proposes to explicitly tackle these challenges, via a principled framework called Co-Modality Graph Contrastive Learning (CM-GCL) to automatically generate contrastive pairs and further learn balanced representation over unlabeled data. Specifically, we design inter-modality GCL to automatically generate contrastive pairs (e.g., node-text) based on rich node content. Inspired by the fact that minority samples can be "forgotten" by pruning deep neural networks, we naturally extend network pruning to our GCL framework for mining minority nodes. Based on this, we co-train two pruned encoders (e.g., GNN and text encoder) in different modalities by pushing the corresponding node-text pairs together and the irrelevant node-text pairs away. Meanwhile, we propose intra-modality GCL by co-training non-pruned GNN and pruned GNN, to ensure node embeddings with similar attribute features stay close.
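Pulling matched node-text pairs together while pushing unmatched pairs apart is commonly written as an InfoNCE-style contrastive loss. The sketch below illustrates that generic loss only; the abstract does not state which objective CM-GCL uses, and the function name `info_nce`, the cosine similarity, and the `temperature` value are assumptions made here. Embeddings are assumed to be non-zero vectors.

```python
import math


def _cosine(a, b):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def info_nce(node_emb, text_emb, temperature=0.5):
    """Average InfoNCE loss where node i and text i form the positive pair.

    All other texts in the batch act as negatives for node i.
    """
    n = len(node_emb)
    loss = 0.0
    for i in range(n):
        logits = [_cosine(node_emb[i], text_emb[j]) / temperature for j in range(n)]
        # Numerically stable log-sum-exp over all candidate texts.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        loss += log_denom - logits[i]
    return loss / n
```

The loss is small when each node embedding is closest to its own text embedding and grows when it is closer to another node's text.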

co-modality graph contrastive learning, imbalanced node classification

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.81)

Zhu, Chaofan, Rui, Xiaobing, Wang, Zhixiao

GraphSB: Boosting Imbalanced Node Classification on Graphs through Structural Balance

arXiv.org Artificial Intelligence, Nov-14-2025

Imbalanced node classification is a critical challenge in graph learning, where most existing methods utilize Graph Neural Networks (GNNs) to learn node representations. These methods can be broadly categorized into the data-level and the algorithm-level. The former aims to synthesize minority-class nodes to mitigate quantity imbalance, while the latter tries to optimize the learning process to highlight minority classes. However, neither category addresses the inherently imbalanced graph structure, which is a fundamental factor that incurs majority-class dominance and minority-class assimilation in GNNs. Our theoretical analysis further supports this critical insight. Therefore, we propose GraphSB (Graph Structural Balance), a novel framework that incorporates Structural Balance as a key strategy to address the underlying imbalanced graph structure before node synthesis. Structural Balance performs a two-stage structure optimization: Structure Enhancement that adaptively builds similarity-based edges to strengthen connectivity of minority-class nodes, and Relation Diffusion that captures higher-order dependencies while amplifying signals from minority classes. Thus, GraphSB balances structural distribution before node synthesis, enabling more effective learning in GNNs. Extensive experiments demonstrate that GraphSB significantly outperforms the state-of-the-art methods. More importantly, the proposed Structural Balance can be seamlessly integrated into state-of-the-art methods as a simple plug-and-play module, increasing their accuracy by an average of 3.67%.
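To make the idea of similarity-based edges for minority-class nodes concrete, here is a rough sketch that connects each minority node to its k most similar nodes by cosine similarity of features. This plain top-k rule is a stand-in chosen for illustration: the abstract describes the paper's rule as adaptive and does not specify it, and Relation Diffusion is not covered here. The function name `add_similarity_edges` and the parameter `k` are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two non-zero feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def add_similarity_edges(features, edges, minority_nodes, k=1):
    """Return an undirected edge set with extra edges for minority nodes.

    Each minority node u gains edges to its k most similar other nodes.
    Edges are stored as sorted (low, high) index tuples.
    """
    new_edges = {tuple(sorted(e)) for e in edges}
    for u in minority_nodes:
        scored = sorted(
            ((cosine(features[u], features[v]), v)
             for v in range(len(features)) if v != u),
            key=lambda t: (-t[0], t[1]),  # highest similarity first, ties by index
        )
        for _, v in scored[:k]:
            new_edges.add(tuple(sorted((u, v))))
    return new_edges
```

The input edge list is left unchanged, so the enhanced structure can be compared against the original graph.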

artificial intelligence, machine learning, node

2511.10022

Country: Asia > China (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Neural Information Processing Systems, Oct-8-2025, 18:34:58 GMT

Rethinking Semi-Supervised Imbalanced Node Classification from Bias-Variance Decomposition

This paper introduces a new approach to address the issue of class imbalance in graph neural networks (GNNs) for learning on graph-structured data.

artificial intelligence, machine learning, variance

Country:

North America > United States (0.14)
Asia > China (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology (0.46)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

arXiv.org Artificial Intelligence, Oct-22-2024

Large Language Model-based Augmentation for Imbalanced Node Classification on Text-Attributed Graphs

Wang, Leyao, Wang, Yu, Ni, Bo, Zhao, Yuying, Derr, Tyler

Node classification on graphs frequently encounters the challenge of class imbalance, leading to biased performance and posing significant risks in real-world applications. Although several data-centric solutions have been proposed, none of them focus on Text-Attributed Graphs (TAGs), and therefore overlook the potential of leveraging the rich semantics encoded in textual features for boosting the classification of minority nodes. Given this crucial gap, we investigate the possibility of augmenting graph data in the text space, leveraging the textual generation power of Large Language Models (LLMs) to handle imbalanced node classification on TAGs. Specifically, we propose a novel approach called LA-TAG (LLM-based Augmentation on Text-Attributed Graphs), which prompts LLMs to generate synthetic texts based on existing node texts in the graph. Furthermore, to integrate these synthetic text-attributed nodes into the graph, we introduce a text-based link predictor to connect the synthesized nodes with the existing nodes. Our experiments across multiple datasets and evaluation metrics show that our framework significantly outperforms traditional non-textual-based data augmentation strategies and specific node imbalance solutions. This highlights the promise of using LLMs to resolve imbalance issues on TAGs.

classification, large language model, natural language

2410.16882

Country:

North America > United States > Oregon (0.04)
Asia > Singapore (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
Asia > Indonesia > Bali (0.04)

Genre:

Research Report (0.83)
Overview (0.53)

Industry: Information Technology (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Neural Information Processing Systems, Oct-11-2024, 09:12:20 GMT

Co-Modality Graph Contrastive Learning for Imbalanced Node Classification

Graph contrastive learning (GCL), leveraging graph augmentations to convert graphs into different views and further train graph neural networks (GNNs), has achieved considerable success on graph benchmark datasets. Yet, there are still some gaps in directly applying existing GCL methods to real-world data. First, handcrafted graph augmentations require trial and error, but still cannot yield consistent performance on multiple tasks. Second, most real-world graph data present a class-imbalanced distribution, but existing GCL methods are not immune to data imbalance. Therefore, this work proposes to explicitly tackle these challenges, via a principled framework called Co-Modality Graph Contrastive Learning (CM-GCL) to automatically generate contrastive pairs and further learn balanced representation over unlabeled data. Specifically, we design inter-modality GCL to automatically generate contrastive pairs (e.g., node-text) based on rich node content.

co-modality graph contrastive learning, imbalanced node classification

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.43)

arXiv.org Artificial Intelligence, Feb-5-2024

Rethinking Semi-Supervised Imbalanced Node Classification from Bias-Variance Decomposition

Yan, Divin, Wei, Gengchen, Yang, Chen, Zhang, Shengzhong, Huang, Zengfeng

This paper introduces a new approach to address the issue of class imbalance in graph neural networks (GNNs) for learning on graph-structured data. Our approach integrates imbalanced node classification and Bias-Variance Decomposition, establishing a theoretical framework that closely relates data imbalance to model variance. We also leverage a graph augmentation technique to estimate the variance and design a regularization term to alleviate the impact of imbalance. Extensive experiments are conducted on multiple benchmarks, including naturally imbalanced datasets and public-split class-imbalanced datasets, demonstrating that our approach outperforms state-of-the-art methods in various imbalanced scenarios. This work provides a novel theoretical perspective for addressing the problem of imbalanced node classification in GNNs.

classification, node, variance

2310.18765

Country:

North America > United States (0.14)
Asia > China (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Promising Solution (0.66)

Industry: Information Technology (0.46)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

arXiv.org Artificial Intelligence, May-3-2023

ImGCL: Revisiting Graph Contrastive Learning on Imbalanced Node Classification

Zeng, Liang, Li, Lanqing, Gao, Ziqi, Zhao, Peilin, Li, Jian

Graph contrastive learning (GCL) has attracted a surge of attention due to its superior performance for learning node/graph representations without labels. However, in practice, the underlying class distribution of unlabeled nodes for the given graph is usually imbalanced. This highly imbalanced class distribution inevitably deteriorates the quality of learned node representations in GCL. Indeed, we empirically find that most state-of-the-art GCL methods cannot obtain discriminative representations and exhibit poor performance on imbalanced node classification. Motivated by this observation, we propose a principled GCL framework on Imbalanced node classification (ImGCL), which automatically and adaptively balances the representations learned from GCL without labels. Specifically, we first introduce the online clustering based progressively balanced sampling (PBS) method with theoretical rationale, which balances the training sets based on pseudo-labels obtained from learned representations in GCL. We then develop the node centrality based PBS method to better preserve the intrinsic structure of graphs, by upweighting the important nodes of the given graph. Extensive experiments on multiple imbalanced graph datasets and imbalanced settings demonstrate the effectiveness of our proposed framework, which significantly improves the performance of the recent state-of-the-art GCL methods. Further experimental ablations and analyses show that the ImGCL framework consistently improves the representation quality of nodes in under-represented (tail) classes.
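A progressively balanced sampling schedule is often written as a linear interpolation between instance-balanced class probabilities (proportional to class size) and class-balanced ones (uniform over classes). The sketch below shows that common schedule applied to pseudo-labels; whether ImGCL uses exactly this interpolation is an assumption, and the clustering step that would produce the pseudo-labels is omitted. The function name `pbs_class_probs` is hypothetical.

```python
from collections import Counter


def pbs_class_probs(pseudo_labels, epoch, total_epochs):
    """Per-class sampling probability at a given epoch.

    At epoch 0 the probabilities follow the pseudo-label frequencies;
    at epoch == total_epochs every class is sampled with equal probability.
    """
    counts = Counter(pseudo_labels)
    n = len(pseudo_labels)
    num_classes = len(counts)
    t = epoch / total_epochs  # training progress in [0, 1]
    return {
        label: (1 - t) * count / n + t / num_classes
        for label, count in counts.items()
    }
```

With nine nodes pseudo-labeled as class 0 and one as class 1, the tail class moves from a 0.1 sampling probability at the start to 0.5 by the final epoch, and the probabilities sum to 1 at every epoch.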

artificial intelligence, machine learning, representation

2205.11332

Country:

Asia > China > Shaanxi Province > Xi'an (0.04)
Asia > China > Jiangsu Province > Nanjing (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

arXiv.org Artificial Intelligence, Apr-9-2023

Class-Imbalanced Learning on Graphs: A Survey

Ma, Yihong, Tian, Yijun, Moniz, Nuno, Chawla, Nitesh V.

In recent years, graph representation learning techniques have proven effective in discovering meaningful vector representations of nodes, edges, or entire graphs, resulting in successful applications across a wide range of downstream tasks [29, 52, 68]. However, graph data often presents a significant challenge in the form of class imbalance, where one class's instances significantly outnumber those of other classes. This imbalance can lead to suboptimal performance when applying machine learning techniques to graph data. Class-imbalanced learning on graphs (CILG) is an emerging research area addressing class imbalance in graph data, where traditional methods for non-graph data might be unsuitable or ineffective for several reasons. Firstly, graph data's unique, irregular, non-Euclidean structure complicates traditional class-imbalance techniques designed for Euclidean data [78]. Secondly, graph data often holds rich relational information, necessitating specialized techniques for preservation and leverage during the learning process [51]. Lastly, node dependencies and interactions in a graph make class re-balancing complex, as naïve oversampling or undersampling may disrupt the graph's structure and thus lead to poor performance [35].

artificial intelligence, machine learning, node

2304.043

Country:

North America > United States > Indiana > St. Joseph County > Notre Dame (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)

Genre:

Overview (1.00)
Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.69)