Masked Images Are Counterfactual Samples for Robust Fine-tuning