Adapting Self-Supervised Vision Transformers by Probing Attention-Conditioned Masking Consistency
