HAP: Structure-Aware Masked Image Modeling for Human-Centric Perception
Neural Information Processing Systems
Model pre-training is essential in human-centric perception. In this paper, we first introduce masked image modeling (MIM) as a pre-training approach for this task. Upon revisiting the MIM training strategy, we reveal that human structure priors offer significant potential. Motivated by this insight, we further incorporate an intuitive human structure prior, human parts, into pre-training. Specifically, we employ this prior to guide the mask sampling process.
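As a rough illustration of what part-guided mask sampling could look like, the sketch below biases the choice of masked patches toward patches that overlap a human part. It is not the paper's implementation: the function name `part_guided_mask`, the per-patch `part_ids` labels, and the `mask_ratio` and `part_weight` values are all assumptions made for this example.

```python
import random


def part_guided_mask(part_ids, mask_ratio=0.75, part_weight=4.0, seed=None):
    """Pick patches to mask, favoring patches that lie on a human part.

    part_ids: one int per image patch; 0 means background and any
        positive value is a (hypothetical) human-part label.
    mask_ratio: fraction of patches to mask (illustrative default).
    part_weight: how much more likely a part patch is to be masked
        than a background patch (illustrative default).
    Returns a list of booleans, True where the patch is masked.
    """
    rng = random.Random(seed)
    num_patches = len(part_ids)
    num_masked = round(mask_ratio * num_patches)

    # Weighted sampling without replacement: each patch draws an
    # exponential key whose rate is its weight, and the patches with
    # the smallest keys are selected.
    keys = []
    for idx, part in enumerate(part_ids):
        weight = part_weight if part > 0 else 1.0
        keys.append((rng.expovariate(weight), idx))

    masked = {idx for _, idx in sorted(keys)[:num_masked]}
    return [i in masked for i in range(num_patches)]


# A 4x4 patch grid, flattened; the person occupies the middle columns.
example_part_ids = [
    0, 1, 1, 0,
    0, 2, 2, 0,
    0, 3, 3, 0,
    0, 4, 4, 0,
]
example_mask = part_guided_mask(example_part_ids, mask_ratio=0.75, seed=0)
```

With `part_weight` set to 1.0 this reduces to uniform random masking, so the weight is the single knob that controls how strongly the prior steers the mask.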
Jan-19-2025, 17:14:21 GMT