SemMAE: Semantic-Guided Masking for Learning Masked Autoencoders

Oct-11-2024, 05:27:25 GMT–Neural Information Processing Systems

Recently, significant progress has been made in masked image modeling to catch up to masked language modeling. However, unlike words in NLP, the lack of semantic decomposition of images still makes masked autoencoding (MAE) different between vision and language. In this paper, we explore a potential visual analogue of words, i.e., semantic parts, and we integrate semantic information into the training process of MAE by proposing a Semantic-Guided Masking strategy. Compared to widely adopted random masking, our masking strategy can gradually guide the network to learn various information, i.e., from intra-part patterns to inter-part relations. In particular, we achieve this in two steps.

learning masked autoencoder, semantic-guided masking, semmae, (4 more...)

Neural Information Processing Systems

Oct-11-2024, 05:27:25 GMT

Conferences Web Page

Add feedback

Country:
- Asia > China > Heilongjiang Province > Daqing (0.08)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.40)