Learning to Contextualize Web Pages for Enhanced Decision Making by LLM Agents

Lee, Dongjun, Lee, Juyong, Kim, Kyuyoung, Tack, Jihoon, Shin, Jinwoo, Teh, Yee Whye, Lee, Kimin

Mar-11-2025–arXiv.org Artificial Intelligence

Recent advances in large language models (LLMs) have led to a growing interest in developing LLM-based agents for automating web tasks. However, these agents often struggle with even simple tasks on real-world websites due to their limited capability to understand and process complex web page structures. In this work, we introduce LCoW, a framework for Learning language models to Contextualize complex Web pages into a more comprehensible form, thereby enhancing decision making by LLM agents. LCoW decouples web page understanding from decision making by training a separate contextualization module to transform complex web pages into comprehensible format, which are then utilized by the decision-making agent. We demonstrate that our contextualization module effectively integrates with LLM agents of various scales to significantly enhance their decision-making capabilities in web automation tasks. Notably, LCoW improves the success rates of closed-source LLMs (e.g., Gemini-1.5-flash, GPT-4o, Claude-3.5-Sonnet) by an average of 15.6%, and demonstrates a 23.7% average improvement in success rates for open-source LMs (e.g., Llama-3.1-8B, Llama-3.1-70B) on the WorkArena benchmark. Moreover, the Gemini-1.5-flash agent with LCoW achieves state-of-the-art results on the WebShop benchmark, outperforming human experts. The relevant code materials are available at our project page: https://lcowiclr2025.github.io.

agent, contextualization module, llm agent, (16 more...)

arXiv.org Artificial Intelligence

Mar-11-2025

arXiv.org PDF

Add feedback

Country:
- North America > United States
  - Georgia > Fulton County > Atlanta (0.04)
- Europe > United Kingdom
  - England > Oxfordshire > Oxford (0.04)

Genre:
- Research Report (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Machine Learning
    - Neural Networks > Deep Learning (1.00)
    - Learning Graphical Models > Undirected Networks
      - Markov Models (0.46)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found