Experience-Driven Exploration for Efficient API-Free AI Agents
Tang, Chenwei, Xing, Jingyu, Liu, Xinyu, Wang, Zizhou, Du, Jiawei, Zhen, Liangli, Lv, Jiancheng
arXiv.org Artificial Intelligence
Most existing software lacks accessible Application Programming Interfaces (APIs), requiring agents to operate solely through pixel-based Graphical User Interfaces (GUIs). In this API-free setting, large language model (LLM)-based agents face severe efficiency bottlenecks: limited to local visual experiences, they make myopic decisions and rely on inefficient trial-and-error, hindering both skill acquisition and long-term planning. To address these challenges, we propose KG-Agent, an experience-driven learning framework that structures an agent's raw pixel-level interactions into a persistent State-Action Knowledge Graph (SA-KG). KG-Agent overcomes inefficient exploration by linking functionally similar but visually distinct GUI states, forming a rich neighborhood of experience that enables the agent to generalize from a diverse set of historical strategies. To support long-horizon reasoning, we design a hybrid intrinsic reward mechanism based on the graph topology, combining a state value reward for exploiting known high-value pathways with a novelty reward that encourages targeted exploration. This approach decouples strategic planning from pure discovery, allowing the agent to effectively value setup actions with delayed gratification. We evaluate KG-Agent in two complex, open-ended GUI-based decision-making environments (Civilization V and Slay the Spire), demonstrating significant improvements in exploration efficiency and strategic depth over state-of-the-art methods.
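To make the hybrid intrinsic reward concrete, the following is a minimal sketch of a state-action graph that combines a state value term with a novelty term. It is not the paper's SA-KG implementation: the class name `StateActionGraph`, the weights `alpha` and `beta`, the discount `gamma`, the inverse-square-root novelty bonus, and the simple value backup are all assumptions chosen for illustration, since the abstract does not specify the reward formulas.

```python
import math
from collections import defaultdict


class StateActionGraph:
    """Toy state-action graph (hypothetical; not the paper's SA-KG)."""

    def __init__(self, alpha=1.0, beta=0.5, gamma=0.9):
        self.alpha = alpha  # assumed weight on the state value term
        self.beta = beta    # assumed weight on the novelty term
        self.gamma = gamma  # assumed discount for propagating value backwards
        self.edges = defaultdict(dict)    # state -> {action: next_state}
        self.visits = defaultdict(int)    # state -> number of times reached
        self.value = defaultdict(float)   # state -> estimated value

    def record(self, state, action, next_state, extrinsic=0.0):
        """Store one observed transition and any reward seen on arrival."""
        self.edges[state][action] = next_state
        self.visits[next_state] += 1
        self.value[next_state] = max(self.value[next_state], extrinsic)

    def backup(self, sweeps=10):
        """Propagate value along edges so setup states inherit delayed reward."""
        for _ in range(sweeps):
            for state, actions in list(self.edges.items()):
                best = max(self.value[nxt] for nxt in actions.values())
                self.value[state] = max(self.value[state], self.gamma * best)

    def novelty(self, state):
        """Count-based bonus (an assumption): rarely visited states score higher."""
        return 1.0 / math.sqrt(1 + self.visits[state])

    def intrinsic_reward(self, next_state):
        """Hybrid reward: weighted state value plus weighted novelty."""
        return (self.alpha * self.value[next_state]
                + self.beta * self.novelty(next_state))


# Usage with made-up state and action names.
g = StateActionGraph()
g.record("menu", "open_shop", "shop")
g.record("shop", "buy_card", "deck_upgraded", extrinsic=1.0)
g.backup()
print(g.intrinsic_reward("shop"), g.intrinsic_reward("unseen"))
```

In this sketch the "shop" state earns a higher intrinsic reward than a never-seen state because the backup step passes discounted value from the rewarded successor to the setup state, which is one simple way to credit actions whose payoff is delayed.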
Nov-4-2025
- Genre:
- Overview (1.00)
- Research Report > Promising Solution (0.34)
- Industry:
- Leisure & Entertainment > Games > Computer Games (0.68)
- Technology: