AITopics | auxiliary classifier

HYDRA-FL: Hybrid Knowledge Distillation for Robust and Accurate Federated Learning

Neural Information Processing Systems | Feb-14-2026, 06:23:19 GMT

Data heterogeneity among Federated Learning (FL) users poses a significant challenge, resulting in reduced global model performance. The community has designed various techniques to tackle this issue, among which Knowledge Distillation (KD)-based techniques are common. While these techniques effectively improve performance under high heterogeneity, they inadvertently cause higher accuracy degradation under model poisoning attacks (known as attack amplification). This paper presents a case study to reveal this critical vulnerability in KD-based FL systems. We show why KD causes this issue through empirical evidence and use it as motivation to design a hybrid distillation technique. We introduce a novel algorithm, Hybrid Knowledge Distillation for Robust and Accurate FL (HYDRA-FL), which reduces the impact of model poisoning attacks by offloading some of the KD loss to a shallow layer via an auxiliary classifier.
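
The idea of splitting the KD loss between the final layer and a shallow auxiliary classifier can be illustrated with a minimal sketch. This is not the paper's formulation: the function names, the weights `beta_final` and `beta_aux`, and the temperature are hypothetical, and the sketch assumes a standard temperature-softened KL distillation term applied to both heads.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax with optional temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, label):
    """Cross-entropy of the (untempered) prediction against a hard label."""
    return -math.log(softmax(logits)[label])

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def hybrid_kd_loss(final_logits, aux_logits, teacher_logits, label,
                   beta_final=0.5, beta_aux=0.5, temperature=2.0):
    """Task loss plus KD terms on the final head and a shallow auxiliary head.

    beta_final and beta_aux are assumed weights: lowering beta_final and
    raising beta_aux moves distillation pressure toward the shallow layer.
    """
    teacher = softmax(teacher_logits, temperature)
    ce = cross_entropy(final_logits, label)
    kd_final = kl_divergence(teacher, softmax(final_logits, temperature))
    kd_aux = kl_divergence(teacher, softmax(aux_logits, temperature))
    return ce + beta_final * kd_final + beta_aux * kd_aux
```

When both student heads already match the teacher, the two KD terms vanish and the loss reduces to plain cross-entropy; a poisoned teacher (here, the global model) then influences the final layer only in proportion to `beta_final`.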

accuracy, artificial intelligence, machine learning, (18 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > Virginia (0.04)
North America > United States > Massachusetts > Hampshire County > Amherst (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Education (0.93)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

750046157471c56235a781f2eff6e226-Supplemental-Conference.pdf

Neural Information Processing Systems | Feb-9-2026, 20:37:12 GMT

backbone, bias-conflicting sample, classifier, (16 more...)

Neural Information Processing Systems

Country: Asia > South Korea > Gyeongsangbuk-do > Pohang (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

750046157471c56235a781f2eff6e226-Paper-Conference.pdf

Neural Information Processing Systems | Feb-9-2026, 20:37:08 GMT

However, ERM has been known to cause a learned classifier to be biased toward spurious correlations between predefined classes and latent attributes that appear in a majority of training data [12].

artificial intelligence, classifier, machine learning, (16 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Task-Oriented Feature Distillation

Neural Information Processing Systems | Feb-9-2026, 17:57:09 GMT

Hinton et al. first propose the concept of distillation, where a lightweight student model

artificial intelligence, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country:

North America > Canada (0.04)
Asia > China > Shaanxi Province > Xi'an (0.04)

Industry: Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Vision (0.73)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

4ca82782c5372a547c104929f03fe7a9-Paper.pdf

Neural Information Processing Systems | Feb-8-2026, 13:54:50 GMT

nasty teacher, skeptical student, student, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.28)
Asia (0.04)

Genre: Research Report (0.46)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.68)

Twin Auxiliary Classifiers GAN

Neural Information Processing Systems | Dec-25-2025, 08:51:15 GMT

Conditional generative models have enjoyed significant progress over the past few years. One popular conditional model is the Auxiliary Classifier GAN (AC-GAN), which generates highly discriminative images by extending the GAN loss function with an auxiliary classifier. However, the diversity of the samples generated by AC-GAN tends to decrease as the number of classes increases. In this paper, we identify the source of the low-diversity issue theoretically and propose a practical solution to the problem. We show that the auxiliary classifier in AC-GAN imposes perfect separability, which is disadvantageous when the supports of the class distributions have significant overlap. To address the issue, we propose the Twin Auxiliary Classifiers Generative Adversarial Net (TAC-GAN), which adds a new player that interacts with the other players (the generator and the discriminator) in the GAN. Theoretically, we demonstrate that our TAC-GAN can effectively minimize the divergence between the generated and real data distributions. Extensive experimental results show that our TAC-GAN can successfully replicate the true data distributions on simulated data and significantly improves the diversity of class-conditional image generation on real datasets.
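
For context, the AC-GAN objective that the abstract builds on is usually written as a source term and a class term. The notation below (S for the real/fake source, C for the class label, X for an image) is an assumption for illustration and is not taken from this listing.

```latex
\begin{aligned}
L_S &= \mathbb{E}\left[\log P(S=\text{real}\mid X_{\text{real}})\right]
     + \mathbb{E}\left[\log P(S=\text{fake}\mid X_{\text{fake}})\right],\\
L_C &= \mathbb{E}\left[\log P(C=c\mid X_{\text{real}})\right]
     + \mathbb{E}\left[\log P(C=c\mid X_{\text{fake}})\right].
\end{aligned}
```

The discriminator is trained to maximize L_S + L_C and the generator to maximize L_C - L_S. The class term L_C is the auxiliary classifier's contribution, and it is this term that the abstract says imposes perfect separability between classes.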

electronic proceedings, name change, twin auxiliary classifier gan, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.40)

Information-Theoretic Greedy Layer-wise Training for Traffic Sign Recognition

Lyu, Shuyan, Wu, Zhanzimo, Du, Junliang

arXiv.org Artificial Intelligence | Nov-3-2025

Modern deep neural networks (DNNs) are typically trained with a global cross-entropy loss in a supervised end-to-end manner: neurons need to store their outgoing weights; training alternates between a forward pass (computation) and a top-down backward pass (learning) which is biologically implausible. Alternatively, greedy layer-wise training eliminates the need for cross-entropy loss and backpropagation. By avoiding the computation of intermediate gradients and the storage of intermediate outputs, it reduces memory usage and helps mitigate issues such as vanishing or exploding gradients. However, most existing layer-wise training approaches have been evaluated only on relatively small datasets with simple deep architectures. In this paper, we first systematically analyze the training dynamics of popular convolutional neural networks (CNNs) trained by stochastic gradient descent (SGD) through an information-theoretic lens. Our findings reveal that networks converge layer-by-layer from bottom to top and that the flow of information adheres to a Markov information bottleneck principle. Building on these observations, we propose a novel layer-wise training approach based on the recently developed deterministic information bottleneck (DIB) and the matrix-based Rényi's $α$-order entropy functional. Specifically, each layer is trained jointly with an auxiliary classifier that connects directly to the output layer, enabling the learning of minimal sufficient task-relevant representations. We empirically validate the effectiveness of our training procedure on CIFAR-10 and CIFAR-100 using modern deep CNNs and further demonstrate its applicability to a practical task involving traffic sign recognition. Our approach not only outperforms existing layer-wise training baselines but also achieves performance comparable to SGD.
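
The matrix-based Rényi α-order entropy mentioned above is commonly defined on a trace-normalized Gram matrix A as S_α(A) = (1/(1-α)) log2 tr(A^α). The sketch below is a minimal illustration and not the paper's code: it assumes a Gaussian kernel with a hypothetical bandwidth `sigma`, and it fixes α = 2 so that tr(A^2) equals the sum of squared entries of the symmetric matrix A and no eigendecomposition is needed. The function names are illustrative.

```python
import math

def gaussian_gram(points, sigma=1.0):
    """Gram matrix K_ij = exp(-||x_i - x_j||^2 / (2 sigma^2))."""
    n = len(points)
    return [
        [
            math.exp(
                -sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                / (2 * sigma ** 2)
            )
            for j in range(n)
        ]
        for i in range(n)
    ]

def renyi2_entropy(points, sigma=1.0):
    """Matrix-based Renyi entropy of order 2, in bits.

    The Gaussian kernel has K_ii = 1, so A = K / n has unit trace.
    For symmetric A, tr(A^2) is the sum of squared entries.
    """
    gram = gaussian_gram(points, sigma)
    n = len(points)
    trace_a_squared = sum(
        (gram[i][j] / n) ** 2 for i in range(n) for j in range(n)
    )
    return -math.log2(trace_a_squared)
```

Two limiting cases help check the definition: n identical samples give zero entropy, while n samples that are far apart relative to `sigma` give log2(n) bits.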

artificial intelligence, information, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2510.27651

Country:

Asia > China (0.28)
North America > United States (0.28)

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

5a5ddf0ab751861025c00700093c5677-Paper-Conference.pdf

Neural Information Processing Systems | Oct-10-2025, 03:34:08 GMT

accuracy, distillation, hydra-fl, (15 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > Virginia (0.04)
North America > United States > Massachusetts > Hampshire County > Amherst (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Education (0.93)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Vision (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
(3 more...)

f14bc21be7eaeed046fed206a492e652-Paper.pdf

Neural Information Processing Systems | Aug-17-2025, 05:50:28 GMT

artificial intelligence, hypersphere, machine learning, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

Learning Debiased Classifier with Biased Committee

Neural Information Processing Systems | Aug-15-2025, 23:29:29 GMT

We mark the adopted value in bold. We study the impact of training on randomly sampled subsets, learning by committee, and transferring the knowledge of the main classifier. We mark the best performance in bold. We extend Table 5 of the main paper.

artificial intelligence, classifier, machine learning, (16 more...)

Neural Information Processing Systems

Country: Asia > South Korea > Gyeongsangbuk-do > Pohang (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)
