VELA: An LLM-Hybrid-as-a-Judge Approach for Evaluating Long Image Captions
