OS-HARM: A Benchmark for Measuring Safety of Computer Use Agents

Jun-16-2026, 18:14:29 GMT–Neural Information Processing Systems

Computer use agents are LLM-based agents that can directly interact with a graphical user interface by processing screenshots or accessibility trees. While these systems are gaining popularity, their safety has been largely overlooked, even though evaluating and understanding their potential for harmful behavior is essential for widespread adoption. To address this gap, we introduce OS-HARM, a new benchmark for measuring the safety of computer use agents. OS-HARM is built on top of the OSWorld environment (Xie et al., 2024) and aims to test models across three categories of harm: deliberate user misuse, prompt injection attacks, and model misbehavior.

large language model, machine learning, natural language

Country:
- North America > United States (0.45)

Genre:
- Workflow (0.93)
- Research Report
  - New Finding (1.00)
  - Experimental Study (1.00)

Industry:
- Law (1.00)
- Information Technology > Security & Privacy (1.00)
- Law Enforcement & Public Safety (0.67)
- Government > Regional Government
  - North America Government > United States Government (0.45)

Technology:
- Information Technology
  - Security & Privacy (1.00)
  - Artificial Intelligence
    - Representation & Reasoning > Agents (1.00)
    - Natural Language > Large Language Model (1.00)
    - Machine Learning (1.00)
