Enhance Modality Robustness in Text-Centric Multimodal Alignment with Adversarial Prompting