AITopics | pacmac

Supplemental Material for Adapting Self-Supervised Vision Transformers by Probing Attention-Conditioned Masking Consistency

Neural Information Processing Systems Feb-10-2026, 20:17:10 GMT

To compare the quality of target samples being selected for training, we measure reliability precision (how many of the selected target samples were actually predicted correctly?). We report expected calibration error (ECE [7]); lower is better. We separately visualize features before and after in-domain pretraining with MAE 7 and DINO 8. We note that these features are completely self-supervised as the model has not seen task labels yet. Regardless, we observe a small degree of task-discriminativeness (examples of the same class are clustered together) and domain invariance (examples of the same class but different domains are close) before additional pretraining. We now measure the degree of label overlap between ImageNet-22K and these 3 benchmarks.
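The two metrics named in the excerpt above can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the function names, the 10-bin default, and the equal-width half-open binning are assumptions, and the ECE shown is the common binned formulation rather than one confirmed by the excerpt.

```python
from typing import Sequence


def reliability_precision(selected: Sequence[bool],
                          predictions: Sequence[int],
                          labels: Sequence[int]) -> float:
    """Fraction of the selected samples whose prediction matches the label."""
    hits = [p == y for s, p, y in zip(selected, predictions, labels) if s]
    return sum(hits) / len(hits) if hits else 0.0


def expected_calibration_error(confidences: Sequence[float],
                               predictions: Sequence[int],
                               labels: Sequence[int],
                               n_bins: int = 10) -> float:
    """Binned ECE: sample-weighted mean of |accuracy - confidence| per bin.

    The bin count and equal-width bins are assumptions of this sketch.
    """
    n = len(confidences)
    if n == 0:
        return 0.0
    bins = [[] for _ in range(n_bins)]
    for c, p, y in zip(confidences, predictions, labels):
        idx = min(int(c * n_bins), n_bins - 1)  # confidence 1.0 falls in the last bin
        bins[idx].append((c, p == y))
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(ok for _, ok in members) / len(members)
        ece += (len(members) / n) * abs(accuracy - avg_conf)
    return ece
```

A higher reliability precision means fewer wrongly pseudo-labeled samples enter self-training, while a lower ECE means the model's confidence tracks its accuracy more closely.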

artificial intelligence, initialization, machine learning, (18 more...)

Neural Information Processing Systems

Country: Europe > France (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.95)

Adapting Self-Supervised Vision Transformers by Probing Attention-Conditioned Masking Consistency

Neural Information Processing Systems Feb-10-2026, 20:17:06 GMT

Similarly, self-supervised representation learning (SSL) is rapidly replacing supervised learning as the de-facto pretraining strategy for deep networks, due to improved scalability (unlabeled data is easier to collect) and generality (domain-specific SSL is often preferable to one-fits-all ImageNet pretraining [16,17]).

artificial intelligence, initialization, machine learning, (14 more...)

Neural Information Processing Systems

Country: Asia > Middle East > Jordan (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Adapting Self-Supervised Vision Transformers by Probing Attention-Conditioned Masking Consistency

Neural Information Processing Systems Dec-24-2025, 19:52:21 GMT

Visual domain adaptation (DA) seeks to transfer trained models to unseen, unlabeled domains across distribution shift, but approaches typically focus on adapting convolutional neural network architectures initialized with supervised ImageNet representations. In this work, we shift focus to adapting modern architectures for object recognition -- the increasingly popular Vision Transformer (ViT) -- initialized with modern pretraining based on self-supervised learning (SSL). Inspired by the design of recent SSL approaches based on learning from partial image inputs generated via masking or cropping -- either by learning to predict the missing pixels, or learning representational invariances to such augmentations -- we propose PACMAC, a two-stage adaptation algorithm for self-supervised ViTs. PACMAC first performs in-domain SSL on pooled source and target data to learn task-discriminative features, and then probes the model's predictive consistency across a set of partial target inputs generated via a novel attention-conditioned masking strategy, to identify reliable candidates for self-training. Our simple approach leads to consistent performance gains over competing methods that use ViTs and self-supervised initializations on standard object recognition benchmarks.
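The second stage described in the abstract, probing predictive consistency across partial inputs to pick self-training candidates, might look roughly like the sketch below. It assumes class predictions for the full image and for each masked view are already available. The function names and the strict all-views-agree rule are illustrative assumptions; the abstract does not specify the exact agreement criterion or how the attention-conditioned masks are built.

```python
from typing import List, Sequence, Tuple


def select_reliable(full_preds: Sequence[int],
                    masked_preds: Sequence[Sequence[int]]) -> List[bool]:
    """Flag sample i as reliable when every masked-view prediction matches
    the prediction on the unmasked image (assumed agreement rule)."""
    return [all(m == full for m in views)
            for full, views in zip(full_preds, masked_preds)]


def pseudo_labels(full_preds: Sequence[int],
                  reliable: Sequence[bool]) -> List[Tuple[int, int]]:
    """Return (sample index, pseudo-label) pairs for the reliable samples only."""
    return [(i, p) for i, (p, ok) in enumerate(zip(full_preds, reliable)) if ok]


if __name__ == "__main__":
    full = [2, 0, 1]
    masked = [[2, 2, 2], [0, 1, 0], [1, 1, 1]]  # three masked views per sample
    flags = select_reliable(full, masked)
    print(flags)
    print(pseudo_labels(full, flags))
```

Only the samples flagged as reliable would contribute a pseudo-label loss in self-training; the rest are left out until the model's predictions on them stabilize.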

attention-conditioned masking consistency, name change, self-supervised vision transformer, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.97)

93b4d708976a1d9b1250c400e7fda811-Supplemental-Conference.pdf

Neural Information Processing Systems Aug-17-2025, 01:40:47 GMT

artificial intelligence, initialization, machine learning, (17 more...)

Neural Information Processing Systems

Country: Europe > France (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Vision (0.71)
Information Technology > Sensing and Signal Processing > Image Processing (0.69)

93b4d708976a1d9b1250c400e7fda811-Paper-Conference.pdf

Neural Information Processing Systems Aug-17-2025, 01:40:43 GMT

artificial intelligence, domain adaptation, machine learning, (17 more...)

Neural Information Processing Systems

Country: Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
(2 more...)

Adapting Self-Supervised Vision Transformers by Probing Attention-Conditioned Masking Consistency

Neural Information Processing Systems Jan-17-2025, 21:34:18 GMT

attention-conditioned masking consistency, pacmac, self-supervised vision transformer

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
