VLSU: Mapping the Limits of Joint Multimodal Understanding for AI Safety