VL-PET: Vision-and-Language Parameter-Efficient Tuning via Granularity Control