ScImage: How Good Are Multimodal Large Language Models at Scientific Text-to-Image Generation?