Learning to Mask and Permute Visual Tokens for Vision Transformer Pre-Training