SO-Bench: A Structural Output Evaluation of Multimodal LLMs