BacktrackAgent: Enhancing GUI Agent with Error Detection and Backtracking Mechanism
Qinzhuo Wu, Pengzhi Gao, Wei Liu, Jian Luan
arXiv.org Artificial Intelligence
Graphical User Interface (GUI) agents have gained substantial attention for their ability to complete tasks through multiple interactions within GUI environments. However, existing agents primarily focus on enhancing the accuracy of individual actions and often lack effective mechanisms for detecting and recovering from errors. To address these shortcomings, we propose BacktrackAgent, a robust framework that incorporates a backtracking mechanism to improve task completion efficiency. BacktrackAgent includes verifier, judger, and reflector modules for error detection and recovery, and applies judgment rewards to further enhance the agent's performance. Additionally, we develop a training dataset specifically designed for the backtracking mechanism, which considers the outcome pages after action execution. Experimental results show that BacktrackAgent improves both task success rate and step accuracy on the Mobile3M and Auto-UI benchmarks. Our data and code will be released upon acceptance.
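To make the control flow concrete, the following is a minimal sketch of how a verifier, judger, and reflector might cooperate in a backtracking loop. It is not the authors' implementation: the abstract does not specify the interfaces or the role of each module, so every name here (`ToyEnv`, `verifier`, `judger`, `reflector`, `run_with_backtracking`) and the division of labor between them is an assumption made for illustration. The components in the paper are presumably trained models; here they are replaced with simple rules over a toy page graph.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Set, Tuple

# All names below are hypothetical; the abstract does not define these interfaces.
Page = str
Action = str


@dataclass
class ToyEnv:
    """A tiny deterministic stand-in for a GUI: (page, action) -> next page."""
    transitions: Dict[Tuple[Page, Action], Page]
    page: Page = "home"

    def step(self, action: Action) -> Page:
        # Actions with no defined transition leave the page unchanged.
        self.page = self.transitions.get((self.page, action), self.page)
        return self.page

    def restore(self, page: Page) -> None:
        # Backtracking: return to the page seen before the rejected action.
        self.page = page


def verifier(before: Page, after: Page) -> bool:
    """Assumed role: check that the action had any effect on the page."""
    return before != after


def judger(after: Page, acceptable: Set[Page]) -> bool:
    """Assumed role: check that the outcome page is the kind we expected."""
    return after in acceptable


def reflector(candidates: List[Action], rejected: List[Action]) -> Optional[Action]:
    """Assumed role: pick a new action, avoiding ones that already failed."""
    for action in candidates:
        if action not in rejected:
            return action
    return None


def run_with_backtracking(
    env: ToyEnv,
    propose: Callable[[Page], List[Action]],
    acceptable: Set[Page],
    goal: Page,
    max_steps: int = 10,
) -> List[Action]:
    """Run until the goal page is reached, undoing actions that fail a check."""
    trace: List[Action] = []
    for _ in range(max_steps):
        if env.page == goal:
            break
        before = env.page
        rejected: List[Action] = []
        while True:
            action = reflector(propose(before), rejected)
            if action is None:
                return trace  # no untried candidates remain on this page
            after = env.step(action)
            if verifier(before, after) and judger(after, acceptable):
                trace.append(action)
                break
            env.restore(before)  # error detected: backtrack and try again
            rejected.append(action)
    return trace


# Toy usage: two bad candidates are tried and undone before a good one is kept.
transitions = {
    ("home", "tap_ads"): "ads",
    ("home", "tap_search"): "search",
    ("search", "type_query"): "results",
}
candidates = {
    "home": ["tap_nothing", "tap_ads", "tap_search"],
    "search": ["type_query"],
}
env = ToyEnv(transitions)
trace = run_with_backtracking(
    env,
    propose=lambda page: candidates.get(page, []),
    acceptable={"search", "results"},
    goal="results",
)
print(trace)  # ['tap_search', 'type_query']
```

In this toy run, `tap_nothing` is rejected by the verifier because the page does not change, and `tap_ads` is rejected by the judger because the outcome page is not an acceptable one. Both checks inspect the page that results from executing the action, which mirrors the abstract's emphasis on outcome pages.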
May-28-2025
- Country:
- Asia > Thailand
- Europe > Italy
- North America > United States
- Florida > Miami-Dade County > Miami (0.04)
- Genre:
- Research Report
- Experimental Study (0.46)
- New Finding (0.34)
- Industry:
- Information Technology (0.67)
- Technology:
- Information Technology
- Artificial Intelligence
- Machine Learning > Neural Networks (0.68)
- Natural Language > Large Language Model (0.94)
- Representation & Reasoning > Agents (1.00)
- Communications (1.00)
- Graphics (0.94)
- Information Management (0.93)