Windows Agent Arena: Evaluating Multi-Modal OS Agents at Scale
