HarmonyGuard: Toward Safety and Utility in Web Agents via Adaptive Policy Enhancement and Dual-Objective Optimization

Chen, Yurun, Hu, Xavier, Liu, Yuhan, Yin, Keting, Li, Juncheng, Zhang, Zhuosheng, Zhang, Shengyu

Aug-7-2025–arXiv.org Artificial Intelligence

Large language models enable agents to autonomously perform tasks in open web environments. However, as hidden threats within the web evolve, web agents face the challenge of balancing task performance with emerging risks during long-sequence operations. Although this challenge is critical, current research remains limited to single-objective optimization or single-turn scenarios, lacking the capability for collaborative optimization of both safety and utility in web environments. To address this gap, we propose HarmonyGuard, a multi-agent collaborative framework that leverages policy enhancement and objective optimization to jointly improve both utility and safety. HarmonyGuard features a multi-agent architecture characterized by two fundamental capabilities: (1) Adaptive Policy Enhancement: We introduce the Policy Agent within HarmonyGuard, which automatically extracts and maintains structured security policies from unstructured external documents, while continuously updating policies in response to evolving threats. Extensive evaluations on multiple benchmarks show that HarmonyGuard improves policy compliance by up to 38% and task completion by up to 20% over existing baselines, while achieving over 90% policy compliance across all tasks.

Web agents based on Large Language Models (LLMs) have transformed how we interact with the web by enabling autonomous tasks through natural language instructions (OpenAI, 2025; Anthropic, 2025). These agents can perform diverse operations, such as online shopping or booking flights, significantly expanding the scope of web automation. As these agents take on increasingly complex tasks, a critical question emerges: Can we trust web agents to act both intelligently and safely?

large language model, machine learning, natural language

Genre:
- Research Report (0.67)
- Workflow (0.46)

Industry:
- Information Technology > Security & Privacy (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning > Agents (1.00)
  - Natural Language > Large Language Model (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (0.49)
