VisualWebArena: Evaluating Multimodal Agents on Realistic Visual Web Tasks
