Enhance Vision-Language Alignment with Noise