GUI-Rise: Structured Reasoning and History Summarization for GUI Navigation

Liu, Tao, Wang, Chongyu, Li, Rongjie, Yu, Yingchen, He, Xuming, Song, Bai

Nov-3-2025–arXiv.org Artificial Intelligence

While Multimodal Large Language Models (MLLMs) have advanced GUI navigation agents, current approaches face limitations in cross-domain generalization and effective history utilization. We present a reasoning-enhanced framework that systematically integrates structured reasoning, action prediction, and history summarization. The structured reasoning component generates coherent Chain-of-Thought analyses combining progress estimation and decision reasoning, which inform both immediate action predictions and compact history summaries for future steps. Based on this framework, we train a GUI agent, \textbf{GUI-Rise}, through supervised fine-tuning on pseudo-labeled trajectories and reinforcement learning with Group Relative Policy Optimization (GRPO). This framework employs specialized rewards, including a history-aware objective, directly linking summary quality to subsequent action performance. Comprehensive evaluations on standard benchmarks demonstrate state-of-the-art results under identical training data conditions, with particularly strong performance in out-of-domain scenarios. These findings validate our framework's ability to maintain robust reasoning and generalization across diverse GUI navigation tasks. Code is available at https://leon022.github.io/GUI-Rise.

arxiv preprint arxiv, large language model, machine learning, (19 more...)

arXiv.org Artificial Intelligence

Nov-3-2025

arXiv.org PDF

Add feedback

Genre:
- Workflow (1.00)
- Research Report > New Finding (0.67)

Technology:
- Information Technology
  - Graphics (1.00)
  - Communications > Mobile (1.00)
  - Artificial Intelligence
    - Representation & Reasoning (1.00)
    - Natural Language > Large Language Model (1.00)
    - Cognitive Science (1.00)
    - Machine Learning > Neural Networks
      - Deep Learning (0.93)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found