What Makes a "Good" Data Augmentation in Knowledge Distillation - A Statistical Perspective
Knowledge distillation (KD) is a general neural network training approach that uses a teacher model to guide the student model. Existing works mainly study KD from the network output side (e.g., trying to design a better KD loss function), while few have attempted to understand it from the input side. In particular, its interplay with data augmentation (DA) has not been well understood. In this paper, we ask: Why do some DA schemes (e.g., CutMix) inherently perform much better than others in KD? What makes a "good" DA in KD? Our investigation from a statistical perspective suggests that a good DA scheme should reduce the covariance of the teacher-student cross-entropy.
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > Mexico > Gulf of Mexico (0.04)
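The criterion in the abstract above is directly computable. The paper states it in terms of the covariance of the teacher-student cross-entropy; a closely related, easy-to-measure proxy is the standard deviation of the per-sample cross-entropy across augmented inputs. A minimal numpy sketch follows; everything in it, from `rand_probs` to the toy batch, is a hypothetical stand-in for real teacher/student softmax outputs on augmented data, not the authors' code:

```python
import numpy as np

def teacher_student_ce(p_teacher, p_student, eps=1e-12):
    """Per-sample cross-entropy H(p_t, p_s) = -sum_k p_t[k] * log p_s[k]."""
    return -np.sum(p_teacher * np.log(p_student + eps), axis=-1)

def da_stddev(p_teacher, p_student):
    """Std. dev. of the teacher-student cross-entropy across augmented
    samples; under the paper's criterion, lower indicates a better DA."""
    return teacher_student_ce(p_teacher, p_student).std()

# Toy stand-ins: 1000 augmented samples, 10 classes. In practice these
# would be teacher/student softmax outputs on the same augmented inputs.
rng = np.random.default_rng(0)
def rand_probs(n, k):
    logits = rng.normal(size=(n, k))
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

p_t, p_s = rand_probs(1000, 10), rand_probs(1000, 10)
print(f"stddev of teacher-student CE: {da_stddev(p_t, p_s):.4f}")
```

To compare two DA schemes, one would run both through the same teacher and student and pick the scheme with the lower value.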
SamGoG: A Sampling-Based Graph-of-Graphs Framework for Imbalanced Graph Classification
Wang, Shangyou, Ding, Zezhong, Xie, Xike
Graph Neural Networks (GNNs) have shown remarkable success in graph classification tasks by capturing both structural and feature-based representations. However, real-world graphs often exhibit two critical forms of imbalance: class imbalance and graph size imbalance. These imbalances can bias the learning process and degrade model performance. Existing methods typically address only one type of imbalance or incur high computational costs. In this work, we propose SamGoG, a sampling-based Graph-of-Graphs (GoG) learning framework that effectively mitigates both class and graph size imbalance. SamGoG constructs multiple GoGs through an efficient importance-based sampling mechanism and trains on them sequentially. This sampling mechanism incorporates learnable pairwise similarity and adaptive GoG node degree to enhance edge homophily, thus improving downstream model quality. SamGoG can seamlessly integrate with various downstream GNNs, enabling their efficient adaptation for graph classification tasks. Extensive experiments on benchmark datasets demonstrate that SamGoG achieves state-of-the-art performance, with up to a 15.66% accuracy improvement and a 6.7$\times$ training speedup.
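As a rough illustration of the importance-based sampling idea, the sketch below weights graphs by inverse class frequency and links each sampled graph to its most similar neighbours. This is a simplified stand-in, not SamGoG itself: the paper's pairwise similarity is learnable and its GoG node degrees are adaptive, whereas here similarity is fixed cosine and the degree `k` is constant; `embeddings` and `labels` are hypothetical inputs.

```python
import numpy as np

def build_gog(embeddings, labels, sample_size, k, rng):
    """Sample graphs with inverse-class-frequency importance weights, then
    link each sampled graph to its k most similar neighbours by cosine
    similarity. Returns (sampled graph indices, GoG edge list)."""
    # Importance weights: inverse class frequency, normalized to sum to 1.
    classes, counts = np.unique(labels, return_counts=True)
    freq = dict(zip(classes, counts))
    w = np.array([1.0 / freq[y] for y in labels])
    w /= w.sum()
    idx = rng.choice(len(labels), size=sample_size, replace=False, p=w)

    # Cosine similarity among the sampled graph embeddings.
    e = embeddings[idx]
    e = e / np.linalg.norm(e, axis=1, keepdims=True)
    sim = e @ e.T
    np.fill_diagonal(sim, -np.inf)  # exclude self-loops

    edges = [(int(idx[i]), int(idx[j]))
             for i in range(sample_size)
             for j in np.argsort(sim[i])[-k:]]
    return idx, edges

# Hypothetical per-graph embeddings with class-imbalanced labels.
rng = np.random.default_rng(0)
emb = rng.normal(size=(100, 16))
lab = np.array([0] * 70 + [1] * 25 + [2] * 5)
nodes, edges = build_gog(emb, lab, sample_size=32, k=4, rng=rng)
```

Repeating this construction yields the multiple GoGs that the framework trains on sequentially.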
What Makes a "Good" Data Augmentation in Knowledge Distillation - A Statistical Perspective
Knowledge distillation (KD) is a general neural network training approach that uses a teacher model to guide the student model. Existing works mainly study KD from the network output side (e.g., trying to design a better KD loss function), while few have attempted to understand it from the input side. Especially, its interplay with data augmentation (DA) has not been well understood. In this paper, we ask: Why do some DA schemes (e.g., CutMix) inherently perform much better than others in KD? What makes a "good" DA in KD?
Conditional Generative Models for Contrast-Enhanced Synthesis of T1w and T1 Maps in Brain MRI
Piening, Moritz, Altekrüger, Fabian, Steidl, Gabriele, Hattingen, Elke, Steidl, Eike
Contrast enhancement by Gadolinium-based contrast agents (GBCAs) is a vital tool for tumor diagnosis in neuroradiology. Based on brain MRI scans of glioblastoma before and after Gadolinium administration, we address enhancement prediction by neural networks with two new contributions. Firstly, we study the potential of generative models, more precisely conditional diffusion and flow matching, for uncertainty quantification in virtual enhancement. Secondly, we examine the performance of T1 maps from quantitative MRI versus T1-weighted scans. Unlike T1-weighted scans, T1 maps have the advantage of a physically meaningful and therefore comparable voxel range. To compare the network prediction performance of these two modalities, which have incompatible gray-value scales, we propose to evaluate segmentations of contrast-enhanced regions of interest using Dice and Jaccard scores. Across models, we observe better segmentations with T1 maps than with T1-weighted scans.
- North America > United States > New York > New York County > New York City (0.04)
- Europe > United Kingdom (0.04)
- Europe > Germany (0.04)
- Health & Medicine > Therapeutic Area > Neurology (1.00)
- Health & Medicine > Diagnostic Medicine > Imaging (1.00)