Bridging Domain Gaps between Pretrained Multimodal Models and Recommendations