Goto

Collaborating Authors

 stddev




What Makes a "Good" Data Augmentation in Knowledge Distillation - A Statistical Perspective

Neural Information Processing Systems

Knowledge distillation (KD) is a general neural network training approach that uses a teacher model to guide the student model. Existing works mainly study KD from the network output side (e.g., trying to design a better KD loss function), while few have attempted to understand it from the input side. Especially, its interplay with data augmentation (DA) has not been well understood. In this paper, we ask: Why do some DA schemes (e.g., CutMix) inherently perform much better than others in KD? What makes a "good" DA in KD? Our investigation from a statistical perspective suggests that a good DA scheme should reduce the covariance of the teacher-student cross-entropy.




SamGoG: A Sampling-Based Graph-of-Graphs Framework for Imbalanced Graph Classification

Wang, Shangyou, Ding, Zezhong, Xie, Xike

arXiv.org Artificial Intelligence

Graph Neural Networks (GNNs) have shown remarkable success in graph classification tasks by capturing both structural and feature-based representations. However, real-world graphs often exhibit two critical forms of imbalance: class imbalance and graph size imbalance. These imbalances can bias the learning process and degrade model performance. Existing methods typically address only one type of imbalance or incur high computational costs. In this work, we propose SamGoG, a sampling-based Graph-of-Graphs (GoG) learning framework that effectively mitigates both class and graph size imbalance. SamGoG constructs multiple GoGs through an efficient importance-based sampling mechanism and trains on them sequentially. This sampling mechanism incorporates the learnable pairwise similarity and adaptive GoG node degree to enhance edge homophily, thus improving downstream model quality. SamGoG can seamlessly integrate with various downstream GNNs, enabling their efficient adaptation for graph classification tasks. Extensive experiments on benchmark datasets demonstrate that SamGoG achieves state-of-the-art performance with up to a 15.66% accuracy improvement with 6.7$\times$ training acceleration.


What Makes a "Good" Data Augmentation in Knowledge Distillation - A Statistical Perspective

Neural Information Processing Systems

Knowledge distillation (KD) is a general neural network training approach that uses a teacher model to guide the student model. Existing works mainly study KD from the network output side (e.g., trying to design a better KD loss function), while few have attempted to understand it from the input side. Especially, its interplay with data augmentation (DA) has not been well understood. In this paper, we ask: Why do some DA schemes (e.g., CutMix) inherently perform much better than others in KD? What makes a "good" DA in KD?


Conditional Generative Models for Contrast-Enhanced Synthesis of T1w and T1 Maps in Brain MRI

Piening, Moritz, Altekrüger, Fabian, Steidl, Gabriele, Hattingen, Elke, Steidl, Eike

arXiv.org Artificial Intelligence

Contrast enhancement by Gadolinium-based contrast agents (GBCAs) is a vital tool for tumor diagnosis in neuroradiology. Based on brain MRI scans of glioblastoma before and after Gadolinium administration, we address enhancement prediction by neural networks with two new contributions. Firstly, we study the potential of generative models, more precisely conditional diffusion and flow matching, for uncertainty quantification in virtual enhancement. Secondly, we examine the performance of T1 scans from quantitive MRI versus T1-weighted scans. In contrast to T1-weighted scans, these scans have the advantage of a physically meaningful and thereby comparable voxel range. To compare network prediction performance of these two modalities with incompatible gray-value scales, we propose to evaluate segmentations of contrast-enhanced regions of interest using Dice and Jaccard scores. Across models, we observe better segmentations with T1 scans than with T1-weighted scans.