CODIS: Benchmarking Context-Dependent Visual Comprehension for Multimodal Large Language Models
