Slimmed Asymmetrical Contrastive Learning and Cross Distillation for Lightweight Model Training: Supplementary Material

Neural Information Processing Systems 

In Section 3.2, we proposed the cross-distillation (XD) learning scheme.

ImageNet-1K. The encoders (MobileNet, EfficientNet, ResNet-50) are trained from scratch on ImageNet-1K for 100/200/300 epochs with the proposed method. We set the batch size to 256 and the learning rate to 0.8, and employ the LARS optimizer with weight decay 1.5e-6. The hidden layer dimension of the projector is 4096.
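To make the projector hyperparameter concrete, the following is a minimal PyTorch sketch of a projection head with a 4096-dimensional hidden layer, as stated above. The input dimension (2048, matching a ResNet-50 backbone), the output dimension (256), and the Linear-BatchNorm-ReLU layout are assumptions common to contrastive-learning projectors, not details confirmed by the paper.

```python
import torch
import torch.nn as nn


class Projector(nn.Module):
    """Hypothetical projection head: hidden dim 4096 per the text;
    in_dim/out_dim and the layer layout are illustrative assumptions."""

    def __init__(self, in_dim: int = 2048, hidden_dim: int = 4096, out_dim: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden_dim),
            nn.BatchNorm1d(hidden_dim),
            nn.ReLU(inplace=True),
            nn.Linear(hidden_dim, out_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Maps backbone features (B, in_dim) to embeddings (B, out_dim).
        return self.net(x)
```

For example, `Projector()(torch.randn(8, 2048))` yields an embedding batch of shape `(8, 256)`; with a different backbone (e.g. MobileNet), `in_dim` would change accordingly.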