Agent-X: Evaluating Deep Multimodal Reasoning in Vision-Centric Agentic Tasks
