Gradient-based Jailbreak Images for Multimodal Fusion Models