AI Safety in Practice: Enhancing Adversarial Robustness in Multimodal Image Captioning