"Are We Done Yet?": A Vision-Based Judge for Autonomous Task Completion of Computer Use Agents
