Supplemental Material for Adapting Self-Supervised Vision Transformers by Probing Attention-Conditioned Masking Consistency
–Neural Information Processing Systems
To compare the quality of target samples being selected for training, we measure reliability precision (how many of the selected target samples were actually predicted correctly?). We report expected calibration error (ECE [7]); lower is better. We separately visualize features before and after in-domain pretraining with MAE 7 and DINO 8. We note that these features are completely self-supervised, as the model has not seen task labels yet. Regardless, we observe a small degree of task-discriminativeness (examples of the same class are clustered together) and domain invariance (examples of the same class but different domains are close) before additional pretraining. We now measure the degree of label overlap between ImageNet-22K and these 3 benchmarks.
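The two selection-quality measures mentioned above can be illustrated with a minimal sketch. This is not the authors' code: the function names `selection_precision` and `expected_calibration_error`, the equal-width binning, and the choice of 10 bins are assumptions made for illustration, following the standard definition of ECE as the sample-weighted average gap between per-bin accuracy and per-bin mean confidence.

```python
import numpy as np

def selection_precision(selected, preds, labels):
    """Fraction of selected target samples whose prediction is correct.

    `selected` is a boolean mask over target samples (hypothetical input).
    """
    selected = np.asarray(selected, dtype=bool)
    preds = np.asarray(preds)
    labels = np.asarray(labels)
    if not selected.any():
        return float("nan")  # nothing was selected, precision is undefined
    return float((preds[selected] == labels[selected]).mean())

def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE with equal-width confidence bins (bin count is an assumption)."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            # Gap between accuracy and mean confidence inside this bin,
            # weighted by the fraction of samples that fall in the bin.
            gap = abs(correct[in_bin].mean() - confidences[in_bin].mean())
            ece += in_bin.mean() * gap
    return float(ece)
```

For example, with per-sample maximum softmax confidences as `confidences` and a 0/1 correctness vector as `correct`, a perfectly calibrated model would give an ECE of 0, and a selection mask that keeps only correctly predicted samples would give a precision of 1.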
Feb-10-2026, 20:17:10 GMT