Coordinated Robustness Evaluation Framework for Vision-Language Models