Revisiting Multi-Modal LLM Evaluation