 imagenet-100






Category-Extensible Out-of-Distribution Detection via Hierarchical Context Descriptions: Supplementary Materials. A Implementation Details

Neural Information Processing Systems

We also conduct empirical experiments to verify the effectiveness of those perturbations. As shown in Fig. A1, all of the perturbed text-features [...] In addition, now that every perturbation can directly produce the description (i.e., text-feature) of [...] And the results are shown in Tab. [...] OOD performance when the ID data is shifted. Table A2: Additionally improved ID accuracy on shifted datasets. Fig. A2, compared to the shifted ImageNet-A [...] Sketch only preserves objects' shape and main texture, while the color information has completely vanished.



A.1 PyTorch pseudo-code for MIRA. Algorithm 1: PyTorch pseudo-code of MIRA

Neural Information Processing Systems

In this subsection, we derive the necessary and sufficient condition in Proposition ??. Let B, K be natural numbers. We introduce the proposition from [8] that proves geometric convergence of positive concave mappings. By Corollary 2, g(v(n); Q) is a concave mapping. We do not apply weight decay and use a cosine schedule for the learning rate.
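The training setup mentioned at the end of this snippet (no weight decay, cosine learning-rate schedule) can be illustrated with a minimal sketch. The base learning rate, step count, and the function name `cosine_lr` are hypothetical values chosen for illustration; the snippet does not state the actual hyperparameters.

```python
import math

# Assumption: the snippet only says weight decay is not applied.
WEIGHT_DECAY = 0.0


def cosine_lr(step: int, total_steps: int, base_lr: float, min_lr: float = 0.0) -> float:
    """Return the cosine-annealed learning rate at `step` (0 <= step <= total_steps)."""
    # Clamp progress to [0, 1] so out-of-range steps stay at the endpoints.
    progress = min(max(step / total_steps, 0.0), 1.0)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


# Hypothetical values: 100 steps, base learning rate 0.1.
schedule = [cosine_lr(s, total_steps=100, base_lr=0.1) for s in range(101)]
```

The rate starts at `base_lr`, passes through the midpoint between `base_lr` and `min_lr` halfway through training, and ends at `min_lr`.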



Procedural Image Programs for Representation Learning

Neural Information Processing Systems

Existing work focuses on a handful of curated generative processes which require expert knowledge to design, making it hard to scale up. To overcome this, we propose training with a large dataset of twenty-one thousand programs, each one generating a diverse set of synthetic images.


Uncertainty-Aware Dual-Student Knowledge Distillation for Efficient Image Classification

arXiv.org Artificial Intelligence

Department of Electrical Engineering, Indian Institute of Technology Bombay, 21D070002, aakash.gore@iitb.ac.in. Abstract: Knowledge distillation has emerged as a powerful technique for model compression, enabling the transfer of knowledge from large teacher networks to compact student models. However, traditional knowledge distillation methods treat all teacher predictions equally, regardless of the teacher's confidence in those predictions. This paper proposes an uncertainty-aware dual-student knowledge distillation framework that leverages teacher prediction uncertainty to selectively guide student learning. We introduce a peer-learning mechanism where two heterogeneous student architectures, specifically ResNet-18 and MobileNetV2, learn collaboratively from both the teacher network and each other. Experimental results on ImageNet-100 demonstrate that our approach achieves superior performance compared to baseline knowledge distillation methods, with ResNet-18 achieving 83.84% top-1 accuracy and MobileNetV2 achieving 81.46% top-1 accuracy, representing improvements of 2.04% and 0.92% respectively over traditional single-student distillation approaches. Deep neural networks have achieved remarkable success across various computer vision tasks, but their deployment on resource-constrained devices remains challenging due to high computational and memory requirements. This technique has become increasingly important as the demand for deploying sophisticated machine learning models on edge devices, mobile platforms, and embedded systems continues to grow. Traditional knowledge distillation approaches use a weighted combination of hard labels derived from ground truth annotations and soft labels generated by teacher predictions to train student networks.
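The traditional objective described in the last sentence, a weighted combination of a hard-label term and a soft-label term, can be sketched for a single sample as follows. This is the widely used temperature-scaled formulation, not the uncertainty-aware method of the paper. The names `kd_loss`, `alpha`, and `temperature`, and their default values, are assumptions made for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kd_loss(student_logits, teacher_logits, label, alpha=0.5, temperature=4.0):
    """Weighted sum of hard-label cross-entropy and soft-label KL divergence.

    alpha and temperature are hypothetical defaults, not values from the paper.
    """
    # Hard-label term: cross-entropy against the ground-truth class.
    ce = -math.log(softmax(student_logits)[label])

    # Soft-label term: KL(teacher || student) on temperature-softened outputs.
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(t * (math.log(t) - math.log(s))
             for t, s in zip(p_teacher, p_student) if t > 0)

    # The T^2 factor keeps the soft-term gradient scale comparable across temperatures.
    return alpha * ce + (1.0 - alpha) * temperature ** 2 * kl
```

With `alpha=1.0` the loss reduces to plain cross-entropy, and with `alpha=0.0` it vanishes whenever the student's logits match the teacher's.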