Slimmed Asymmetrical Contrastive Learning and Cross Distillation for Lightweight Model Training: Supplementary Material

Apr-28-2026, 20:48:13 GMT–Neural Information Processing Systems

In Section 3.2, we proposed the cross-distillation (XD) learning scheme. The distillation objective in Eq. (10) is the inner decorrelation minimization between the embeddings z and [z]. In addition to the correlation-based distillation loss, we also investigate a distillation loss based on the negative logarithm (e.g., …). To avoid an unbalanced loss magnitude, the distillation loss is introduced as a regularization term controlled by the penalty level γ:

L = L_SACL(z_A, z_B) + γ L_CD    (1)

L_CD = ([z_A] log z_A + [z_B] log z_B) / 2    (2)

We empirically observe that the negative logarithm-based distillation loss fails to outperform the proposed cross-distillation loss L_CD with inner-decorrelation minimization, as shown in the ImageNet-100 results below:

Method	Encoder	# of Params (M)	Linear Eval Acc.
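The regularized objective in Eqs. (1) and (2) can be illustrated with a minimal NumPy sketch. Several details are assumptions, not taken from the text: the bracketed embeddings [z_A] and [z_B] are treated as fixed targets, both targets and embeddings are softmax-normalized so that the logarithm is well defined, a leading minus sign is applied so that the term is a negative logarithm, and the loss is averaged over the batch. The function names are hypothetical, and L_SACL is passed in as a precomputed scalar, not implemented here.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def neg_log_distillation(target_a, z_a, target_b, z_b, eps=1e-12):
    """Sketch of Eq. (2): average of two negative-log terms.

    Assumptions (not from the source): targets are constants, both inputs
    are softmax-normalized, and the sign is chosen so the loss is minimized
    when the embedding distribution matches the target distribution.
    """
    def term(target, z):
        p_target = softmax(target)
        log_p = np.log(softmax(z) + eps)
        # Sum over the feature dimension, mean over the batch.
        return -(p_target * log_p).sum(axis=-1).mean()

    return 0.5 * (term(target_a, z_a) + term(target_b, z_b))


def total_loss(l_sacl, l_cd, gamma):
    """Sketch of Eq. (1): base loss plus the gamma-weighted distillation term."""
    return l_sacl + gamma * l_cd


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    z_a, z_b = rng.normal(size=(8, 16)), rng.normal(size=(8, 16))
    t_a, t_b = rng.normal(size=(8, 16)), rng.normal(size=(8, 16))
    l_cd = neg_log_distillation(t_a, z_a, t_b, z_b)
    # l_sacl=1.0 is a placeholder value for the contrastive loss.
    print(total_loss(l_sacl=1.0, l_cd=l_cd, gamma=0.1))
```

Keeping γ as an explicit argument mirrors the role it plays in Eq. (1): it scales the distillation term so that its magnitude does not dominate the base loss.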
