Beyond Single Frames: Can LMMs Comprehend Temporal and Contextual Narratives in Image Sequences?

Open in new window