macOSWorld: A Multilingual Interactive Benchmark for GUI Agents

Neural Information Processing Systems 

Graphical User Interface (GUI) agents show promising capabilities for automating computer-use tasks and facilitating accessibility. However, existing interactive benchmarks are mostly English-only and cover web use or Windows, Linux, and Android environments, but not macOS.